[ceph-users] running Firefly client (0.80.1) against older version (dumpling 0.67.10) cluster?
Anyone know if this is safe in the short term? We're rebuilding our nova-compute nodes and can make sure the Dumpling versions are pinned as part of the process in the future.

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] ceph-deploy with --release (--stable) for dumpling?
ceph-deploy --release dumpling (or, previously, ceph-deploy --stable dumpling) now results in Firefly (0.80.1) being installed; is this intentional? I'm adding another host with more OSDs and guessing it is preferable to deploy the same version.
Re: [ceph-users] ceph-deploy with --release (--stable) for dumpling?
On Tue, Aug 26, 2014 at 5:10 PM, Konrad Gutkowski wrote:
> Ceph-deploy should set priority for ceph repository, which it doesn't; this
> usually installs the best available version from any repository.

Thanks Konrad for the tip. It took several goes (notably, ceph-deploy purge did not, for me at least, seem to remove librbd1 cleanly), but I managed to get 0.67.10 preferred. Basically I did this:

root@ceph12:~# ceph -v
ceph version 0.67.10
root@ceph12:~# cat /etc/apt/preferences
Package: *
Pin: origin ceph.com
Pin-priority: 900

Package: *
Pin: origin ceph.newdream.net
Pin-priority: 900
Re: [ceph-users] SSD journal deployment experiences
On Fri, Sep 5, 2014 at 5:46 PM, Dan Van Der Ster wrote:
>> On 05 Sep 2014, at 03:09, Christian Balzer wrote:
>> You might want to look into cache pools (and dedicated SSD servers with
>> fast controllers and CPUs) in your test cluster and for the future.
>> Right now my impression is that there is quite a bit more polishing to be
>> done (retention of hot objects, etc.) and there have been stability concerns
>> raised here.
>
> Right, Greg already said publicly not to use the cache tiers for RBD.

I lost the context for this statement you reference from Greg (presumably Greg Farnum?): was it a reference to bcache or to Ceph cache tiering? Could you point me to where it was stated, please?
Re: [ceph-users] Monitors not reaching quorum. (SELinux off, IPtables off, can see tcp traffic)
On Wed, Jun 3, 2015 at 8:30 AM, wrote:
> We are running with Jumbo Frames turned on. Is that likely to be the issue?

I got caught by this previously: http://lists.opennebula.org/pipermail/ceph-users-ceph.com/2014-October/043955.html The problem is that Ceph "almost-but-not-quite" works, leading you down lots of fruitless paths.
[ceph-users] anyone using CephFS for HPC?
Wondering if anyone has done comparisons between CephFS and other parallel filesystems like Lustre, typically used in HPC deployments, either for scratch storage or persistent storage to support HPC workflows? Thanks.
Re: [ceph-users] anyone using CephFS for HPC?
On 12/06/2015 3:41 PM, Gregory Farnum wrote:
> ... and the test evaluation was on repurposed Lustre hardware so it was a
> bit odd, ...

Agree, it was old (at least by now) DDN kit (SFA10K?) and not ideally suited to Ceph (a really high OSD-per-host ratio).

> Sage's thesis or some of the earlier papers will be happy to tell you all
> the ways in which Ceph > Lustre, of course, since creating a successor is
> how the project started. ;)
> -Greg

Thanks Greg, yes, those original documents have been well-thumbed, but I was hoping someone had done a more recent comparison, given the significant improvements over the last couple of Ceph releases. My superficial poking about in Lustre doesn't reveal anything particularly compelling in the design or typical deployments that would magically yield higher performance than an equally well-tuned Ceph cluster. Blair Bethwaite commented that Lustre client-side write caching might be more effective than CephFS's at the moment.
[ceph-users] EC pool needs hosts equal to k + m?
I recall a post to the mailing list in the last week or two where someone said that for an EC pool the failure domain defaults to requiring k+m hosts in some versions of Ceph? Can anyone recall the post? Have I got the requirement right?
Re: [ceph-users] EC pool needs hosts equal to k + m?
On Wed, Jun 24, 2015 at 4:29 PM, Yueliang wrote:
> When I use K+M hosts in the EC pool, if M hosts get down, still have K hosts
> active, Can I continue write data to the pool ?

If your CRUSH map specifies a failure domain at the host level (so no two chunks share the same host), then you will be unable to write to the pool. If instead the failure domain is OSD, then with enough OSDs pool writes would still be accepted.

> Since there only have K
> hosts, not K+M hosts, When client write a data to EC pool , Primary OSD will
> split the data to K data pieces, but how about the M coding pieces? is it
> still be calculated and where it should be hold ?

Same as above: with failure domain = host, there would be nowhere to put the M coding pieces.
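The host-vs-OSD failure-domain arithmetic above can be sketched as a toy model (this is not Ceph code; the function name and parameters are made up for illustration):

```python
def ec_write_possible(k, m, live_hosts, failure_domain="host", osds_per_host=1):
    """Can an EC pool with k data + m coding chunks accept writes?

    With failure_domain="host", CRUSH must find k+m distinct hosts;
    with failure_domain="osd", any k+m distinct OSDs will do.
    """
    chunks = k + m
    if failure_domain == "host":
        return live_hosts >= chunks
    return live_hosts * osds_per_host >= chunks

# k=4, m=2: after losing 2 of 6 hosts, only 4 host-level placement targets remain
print(ec_write_possible(4, 2, live_hosts=4))  # False: no room for all 6 chunks
# ...but with failure domain = osd and 2 OSDs per host, 8 targets remain
print(ec_write_possible(4, 2, live_hosts=4, failure_domain="osd", osds_per_host=2))  # True
```

So the answer to Yueliang's question hinges entirely on the failure domain in the CRUSH rule, not on k and m themselves.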
Re: [ceph-users] Configuring Ceph without DNS
> On 13 Jul 2015, at 4:58 pm, Abhishek Varshney wrote:
> I have a requirement wherein I wish to setup Ceph where hostname resolution
> is not supported and I just have IP addresses to work with. Is there a way
> through which I can achieve this in Ceph? If yes, what are the caveats
> associated with that approach?

We've been operating our Dumpling (now Firefly) cluster this way since it was put into production over 18 months ago, using hosts files to define all the Ceph hosts; it works perfectly well.
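For reference, a DNS-free setup like ours boils down to two small config fragments (the hostnames and addresses below are examples only, not our real ones):

```
# /etc/hosts, distributed to every node
10.0.0.11   ceph-mon1
10.0.0.12   ceph-mon2
10.0.0.13   ceph-mon3
10.0.0.21   ceph-osd1

# ceph.conf: monitors can also be listed by IP so no resolution is needed at all
[global]
mon_host = 10.0.0.11,10.0.0.12,10.0.0.13
```

The main caveat is operational: you own the consistency of the hosts files, so push them from one source of truth.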
Re: [ceph-users] 160 Thousand ceph-client.admin.*.asok files : Weird problem, never seen before
On 10/08/2015 12:02 AM, Robert LeBlanc wrote:
> I'm guessing this is on an OpenStack node? There is a fix for this and I
> think it will come out in the next release. For now we have had to disable
> the admin sockets.

Do you know what triggers the fault? We've not seen it on Firefly+RBD for OpenStack.
[ceph-users] ceph-deploy preflight hostname check?
I notice under the HOSTNAME RESOLUTION section the use of 'host -4 {hostname}' as a required test; however, in all my trial deployments so far none would pass, as this command is a direct DNS query, and I usually just add entries to the hosts file instead. Two thoughts: is Ceph expecting to do only DNS queries? Or would it be better for the pre-flight to use 'getent hosts {hostname}' as the test?
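To illustrate the difference: 'getent hosts' (and any actual client) goes through the system resolver configured in nsswitch.conf, so it sees /etc/hosts entries, while 'host -4' talks straight to DNS. A small sketch of a resolver check using Python's getaddrinfo, which follows the same nsswitch path as getent:

```python
import socket

def resolvable(hostname):
    """Return IPv4 addresses via the system resolver (files, dns, ...),
    i.e. the same path 'getent hosts' uses -- unlike 'host -4', which
    queries DNS directly and never consults /etc/hosts."""
    try:
        infos = socket.getaddrinfo(hostname, None, socket.AF_INET)
        return sorted({info[4][0] for info in infos})
    except socket.gaierror:
        return []

print(resolvable("localhost"))  # picked up from /etc/hosts, no DNS involved
```

A preflight built on this (or on getent itself) would pass on hosts-file-only deployments like mine.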
[ceph-users] CephFS test-case
I appreciate CephFS is not a high priority, but here is a user-experience test case that can be a source of stability bugs for Ceph developers to investigate (and hopefully resolve):

CephFS test case
1. Create two clusters, each 3 nodes with 4 OSDs per node
2. I used Ubuntu 13.04 followed by update/upgrade
3. Install Ceph version 0.61 on Cluster A
4. Install release on Cluster B with ceph-deploy
5. Fill Cluster A (version 0.61) with about one million files (all sizes)
6. rsync ClusterA ClusterB
7. In about 12 hours one or two OSDs on ClusterB will crash; restart the OSDs, restart rsync
8. At around 75% full, the OSDs on ClusterB will become unbalanced (some more full than others), and one or more OSDs will then crash

For (4) it is possible to use freely available .ISOs of old user-group CDROMs that are floating around the web; they are a good source of varied content sizes, directory sizes and filename lengths. My impression is that 0.61 was relatively stable, but subsequent versions such as 0.67.2 are less stable in this particular scenario with CephFS.
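For anyone wanting to reproduce step 5 in miniature without hunting down old CDROM ISOs, here is a hedged sketch (the file counts and sizes are scaled way down, and make_corpus is my own name) that generates a tree of files with varied sizes and filename lengths:

```python
import os
import random
import string
import tempfile

def make_corpus(root, n_files=200, max_kb=64, seed=42):
    """Create n_files files of varied sizes and name lengths under root,
    a scaled-down stand-in for the ~1M-file rsync fill test."""
    rng = random.Random(seed)
    made = []
    for i in range(n_files):
        sub = os.path.join(root, f"dir{i % 10}")
        os.makedirs(sub, exist_ok=True)
        name = "".join(rng.choices(string.ascii_lowercase, k=rng.randint(4, 40)))
        path = os.path.join(sub, f"{name}-{i}")
        with open(path, "wb") as f:
            f.write(os.urandom(rng.randint(0, max_kb * 1024)))
        made.append(path)
    return made

root = tempfile.mkdtemp()
files = make_corpus(root)
print(len(files))  # 200
```

Scale n_files and max_kb up and rsync the tree between the two CephFS mounts to drive steps 5 and 6.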
Re: [ceph-users] newbie question: rebooting the whole cluster, powerfailure
On 06/09/2013, at 7:49 PM, "Bernhard Glomm" wrote:
> Can I introduce the cluster network later on, after the cluster is deployed
> and started working?
> (by editing ceph.conf, push it to the cluster members and restart the
> daemons?)

Thanks Bernhard for asking this question, I have the same question. To rephrase: if we use ceph-deploy to set up a cluster, what is the recommended way to add the cluster/client networks later on? It seems that ceph-deploy provides a minimal ceph.conf, not explicitly defining OSDs; how is this file later re-populated with the missing detail?
Re: [ceph-users] xfsprogs not found in RHEL
On Wed, Aug 28, 2013 at 4:46 PM, Stroppa Daniele (strp) wrote:
> You might need the RHEL Scalable File System add-on.

Exactly. I understand this needs to be purchased from Red Hat in order to get access to it if you are using the Red Hat subscription management system. I expect you could drag over the CentOS RPM, but you would then need to track updates/patches yourself (or minimally reconcile differences between Red Hat and CentOS). In summary: XFS on Red Hat is a paid-for option.
Re: [ceph-users] Placement groups on a 216 OSD cluster with multiple pools
On 15/11/2013 8:57 AM, Dane Elwell wrote:
> [2] - I realise the dangers/stupidity of a replica size of 0, but some of
> the data we wish to store just isn't /that/ important.

We've been thinking of this too. The application is storing boot images, ISOs, local repository mirrors etc., where recovery is easy, with a slight inconvenience if the data has to be re-fetched. This suggests a neat additional feature for Ceph: the ability to attach metadata to zero-replica objects that includes a URL from which a copy could be recovered/re-fetched. Then it could all happen auto-magically. We also have users trampolining data between systems in order to buffer fast data streams or handle data surges; this could be zero-replica too.
[ceph-users] beware of jumbo frames
Spent a frustrating day trying to build a new test cluster. It turned out I had jumbo frames set on the cluster network only, but having re-wired the machines recently with a new switch, I forgot to check that it could handle jumbo frames (it can't). Symptoms were stuck/unclean PGs: a small subset of PGs would go active, but a proportion always would not. I got side-tracked by using a ruleset set to OSD (it worked once) but it would not work with host; all red herrings, I think. Anyhow, somewhere deep in Ceph a check at the network layer for fragmentation might be useful (or just remember this message). Thanks to Jean-Charles Lopez (JCL) on IRC for walking me through diagnosis (and sticking with me) while I circled around and around...
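A quick way to confirm a path really carries jumbo frames is ping with the don't-fragment bit set and a payload sized to the MTU. The payload arithmetic (IPv4 assumed) is simple enough to spell out:

```python
def icmp_payload_for_mtu(mtu):
    """Max ICMP echo payload that fits in one unfragmented IPv4 packet:
    MTU minus the 20-byte IPv4 header minus the 8-byte ICMP header."""
    return mtu - 20 - 8

# Use with: ping -M do -s <payload> <peer>   (Linux; -M do sets DF)
print(icmp_payload_for_mtu(9000))  # 8972
print(icmp_payload_for_mtu(1500))  # 1472
```

If 'ping -M do -s 8972' to a cluster-network peer fails while -s 1472 succeeds, some hop (like my new switch) is dropping jumbo frames.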
Re: [ceph-users] v0.87 Giant released
On 30/10/2014 8:56 AM, Sage Weil wrote:
> * *Degraded vs misplaced*: the Ceph health reports from 'ceph -s' and
> related commands now make a distinction between data that is degraded
> (there are fewer than the desired number of copies) and data that is
> misplaced (stored in the wrong location in the cluster).

Is someone able to briefly describe how/why misplaced happens, please? Is it repaired eventually? I've not seen misplaced (yet).

> leveldb_write_buffer_size = 32*1024*1024 = 33554432 // 32MB
> leveldb_cache_size = 512*1024*1204 = 536870912 // 512MB

I noticed the typo (1204 for 1024) and wondered about the code, but I'm not seeing the same values anyway? https://github.com/ceph/ceph/blob/giant/src/common/config_opts.h

OPTION(leveldb_write_buffer_size, OPT_U64, 8 *1024*1024) // leveldb write buffer size
OPTION(leveldb_cache_size, OPT_U64, 128 *1024*1024) // leveldb cache size
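Spelling out the quoted arithmetic, including what the 1204 typo actually evaluates to:

```python
MB = 1024 * 1024
write_buffer = 32 * MB               # 33554432, matches the quoted comment
cache_intended = 512 * MB            # 536870912, the value the comment claims
cache_as_typed = 512 * 1024 * 1204   # what the 1204 typo really yields
print(write_buffer, cache_intended, cache_as_typed)  # 33554432 536870912 631242752
```

So the typed expression is about 90 MB larger than the intended 512 MB; harmless if it was only in the notes, but worth fixing before anyone pastes it into ceph.conf.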
Re: [ceph-users] v0.87 Giant released
On 30/10/2014 11:51 AM, Christian Balzer wrote:
> Thus objects are (temporarily) not where they're supposed to be, but still
> present in sufficient replication.

Thanks for the reminder, I suppose that is obvious :-)

> A much more benign scenario than degraded and I hope that this doesn't even
> generate a WARN in the "ceph -s" report.

Better described as a transitory "hazardous" state, given that the PG distribution might not be optimal for a period of time and (inopportune) failures may tip the health into degraded.
Re: [ceph-users] Compile from source with Kinetic support
On Sat, Nov 29, 2014 at 5:19 AM, Julien Lutran wrote:
> Where can I find this kinetic devel package ?

I guess you want this (the C++ kinetic client)? It has kinetic.h at least. https://github.com/Seagate/kinetic-cpp-client
Re: [ceph-users] experimental features
On Sat, Dec 6, 2014 at 4:36 AM, Sage Weil wrote:
> - enumerate experimental options we want to enable
> ...
> This has the property that no config change is necessary when the
> feature drops its experimental status.

It keeps the risky options in one place too, so they are easier to spot.

> In all of these cases, we can also make a point of sending something to
> the log on daemon startup. I don't think too many people will notice
> this, but it is better than nothing.

Perhaps change the cluster health status to FRAGILE? or AT_RISK?
[ceph-users] replacing an OSD or crush map sensitivity
Could I have a critique of this approach please, as to how I could have done it better, or whether what I experienced simply reflects work still to be done. This is with Ceph 0.61.2 on a quite slow test cluster (logs shared with OSDs, no separate journals, using CephFS).

I knocked the power cord out from a storage node, taking down 4 of the hosted OSDs; all but one came back ok. This is one OSD out of a total of 12, so 1/12 of the storage. Losing an OSD put the cluster into recovery, so all good. The next question was how to get the missing (downed) OSD back online. The OSD was xfs-based and I had to throw away the xfs log to get it to mount. Having done this and got it re-mounted, Ceph then started throwing issue #4855 (I added dmesg and logs to that issue if it helps; I wonder if throwing away the xfs log caused an internal OSD inconsistency, and this causes issue #4855?).

Given that I could not "recover" this OSD as far as Ceph is concerned, I decided to delete and rebuild it. Several hours later the cluster was back to HEALTH_OK, and I proceeded to remove and re-add the bad OSD, following the doc suggestions. The problem is that each change caused a slight change in the crush map, sending the cluster back into recovery and adding several hours' wait per change. I chose to wait until the cluster was back to HEALTH_OK before doing the next step, so overall it has taken a few days to finally get a single OSD back into the cluster.

At one point during recovery the full threshold was triggered on a single OSD, causing the recovery to stop; doing "ceph pg set_full_ratio 0.98" did not help. I was not planning to add data to the cluster while doing recovery operations, and did not understand the suggestion that PGs could be deleted to make space on a "full" OSD, so I expect raising the threshold was the best option, but it had no (immediate) effect.

I am now back to having all 12 OSDs in, with the hopefully final recovery under way while it re-balances the OSDs. I note I am still getting the full OSD warning, but I expect this to disappear soon now that the 12th OSD is back online. During this recovery the percentage degraded has been a little confusing: while the 12th OSD was offline the percentages were around 15-20% IIRC, but now I see the percentage is 35% and slowly dropping. I am not sure I understand the ratios, and why they are so high with a single missing OSD.

A few documentation errors caused confusion too. This page still contains errors in the steps to create a new OSD (manually): http://eu.ceph.com/docs/wip-3060/cluster-ops/add-or-rm-osds/#adding-an-osd-manual ("ceph osd create {osd-num}" should be "ceph osd create"), and on this page: http://eu.ceph.com/docs/wip-3060/cluster-ops/crush-map/#addosd I had to put host= to get the command accepted.

Suggestions and questions:
1. Is there a way to get documentation pages fixed? Or at least health warnings on them: "This page badly needs updating since it is wrong/misleading".
2. We need a small set of definitive, succinct recipes that provide steps to recover from common failures, with a narrative around what to expect at each step ("your cluster will be in recovery here...").
3. Some commands throw erroneous errors that are actually benign: "ceph-osd -i 10 --mkfs --mkkey" complains about failures that are expected as the OSD is initially empty.
4. An easier way to capture the state of the cluster for analysis. I don't feel confident that when asked for "logs" I am giving the most useful snippets or the complete story. It seems we need a tool that can gather all this in a neat bundle for later dissection or forensics.
5. Is there a more straightforward (faster) way of getting an OSD back online? It almost seems worth having a standby OSD ready to step in and assume duties (a hot spare?).
6. Is there a way to make the crush map less sensitive to changes during recovery operations? I would have liked to stall/slow recovery while I replaced the OSD, then let it run at full speed.

Excuses: I'd be happy to action suggestions, but my current level of Ceph understanding is still too limited and effort on my part would be unproductive; I am prodding the community to see if there is consensus on the need.
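On suggestion 6 (stalling recovery while swapping an OSD), here is a hedged sketch of the recipe I wish I'd had. The commands match the docs of this era, but please verify against your release; the OSD id (11) and hostname (ceph12) are examples, and note that noout only prevents the down-to-out rebalance, so the crush add/remove steps will still move data, which is why batching them matters:

```shell
ceph osd set noout                  # stop the down->out rebalance while we work
ceph osd out 11
service ceph stop osd.11            # sysvinit era; stop the daemon
ceph osd crush remove osd.11
ceph auth del osd.11
ceph osd rm 11

# ...rebuild the disk/filesystem, then re-add (do these quickly, back to back):
ceph osd create                     # no argument; returns the new id
ceph-osd -i 11 --mkfs --mkkey
ceph auth add osd.11 osd 'allow *' mon 'allow rwx' \
    -i /var/lib/ceph/osd/ceph-11/keyring
ceph osd crush add osd.11 1.0 root=default host=ceph12   # host= is required
service ceph start osd.11
ceph osd unset noout                # let recovery run once, at the end
```

The point is to trigger one recovery at the end rather than one per step.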
Re: [ceph-users] replacing an OSD or crush map sensitivity
On 4/06/2013 9:16 AM, Chen, Xiaoxi wrote:
> my 0.02, you really dont need to wait for health_ok between your
> recovery steps, just go ahead. Everytime a new map be generated and
> broadcasted, the old map and in-progress recovery will be canceled

Thanks Xiaoxi, that is helpful to know. It seems to me that there might be a failure mode (or race condition?) here though, as the cluster is now struggling to recover, because the replacement OSD caused the cluster to go into backfill_toofull. The failure sequence might be:

1. From HEALTH_OK, crash an OSD
2. Wait for recovery
3. Remove the OSD using the usual procedures
4. Wait for recovery
5. Add back the OSD using the usual procedures
6. Wait for recovery
7. Cluster is unable to recover due to toofull conditions

Perhaps this is a needed test case: round-trip a cluster through a known failure/recovery scenario. Note this is a simplistically configured test cluster with CephFS in the mix and about 2.5 million files.

Something else I noticed: I restarted the cluster (and set the leveldb compact option, since I'd run out of space on the roots) and now I see it is again making progress on the backfill. Seems odd that the cluster pauses but a restart clears the pause; is that by design?
Re: [ceph-users] replacing an OSD or crush map sensitivity
On Tue, Jun 4, 2013 at 1:59 PM, Sage Weil wrote:
> On Tue, 4 Jun 2013, Nigel Williams wrote:
>> Something else I noticed: ...
>
> Does the monitor data directory share a disk with an OSD? If so, that
> makes sense: compaction freed enough space to drop below the threshold...

Of course! That is exactly it, thanks. Scratch that last observation; red herring.
Re: [ceph-users] Drive replacement procedure
On 25/06/2013 5:59 AM, Brian Candler wrote:
> On 24/06/2013 20:27, Dave Spano wrote:
> Here's my procedure for manually adding OSDs.

The other thing I discovered is not to wait between steps; some changes result in a new crushmap, which then triggers replication. You want to speed through the steps so the cluster does not waste time moving objects around to meet the replica requirements until you have finished the crushmap changes.
[ceph-users] Luminous 12.1.3: mgr errors
Cluster is ok and mgr is active, but I am unable to get the dashboard to start. I see the following errors in the logs:

2017-08-12 15:40:07.805991 7f508effd500 0 pidfile_write: ignore empty --pid-file
2017-08-12 15:40:07.810124 7f508effd500 -1 auth: unable to find a keyring on /var/lib/ceph/mgr/ceph-0/keyring: (2) No such file or directory
2017-08-12 15:40:07.810145 7f508effd500 -1 monclient: ERROR: missing keyring, cannot use cephx for authentication

and an unrelated error, I think (traceback lines appear here in reverse order):

RuntimeError: no certificate configured
raise RuntimeError('no certificate configured')
File "/usr/lib/ceph/mgr/restful/module.py", line 299, in _serve
self._serve()
File "/usr/lib/ceph/mgr/restful/module.py", line 248, in serve

Setup was done by adding an [mgr] section to ceph.conf, then:

ceph config-key put mgr/dashboard/server_addr ::
systemctl restart ceph-mgr@0
Re: [ceph-users] Luminous 12.1.3: mgr errors
On 12 August 2017 at 23:04, David Turner wrote:
> I haven't set up the mgr service yet, but your daemon folder is missing
> its keyring file (/var/lib/ceph/mgr/ceph-0/keyring). It's exactly what
> the error message says. When you set it up, did you run a command like ceph
> auth add? If you did, then you just need to ask the cluster what the auth
> key is and put it into that keyring file. You can look at the keyring for a
> mon to see what format it's expecting.

ceph auth list shows:

mgr.c0mds-100
    key: AQDVXI1ZV1U0KRAAFDY6/ZCVzTjxhy0d5/ReSA==
    caps: [mds] allow *
    caps: [mon] allow profile mgr
    caps: [osd] allow *

and there is this:

root@c0mds-100:/var/lib/ceph/mgr/ceph-c0mds-100# ls -l
total 4
-rw-r--r-- 1 root root 0 Aug 11 17:29 done
-rw-r--r-- 1 root root 64 Aug 11 17:29 keyring
-rw-r--r-- 1 root root 0 Aug 11 17:29 systemd
root@c0mds-100:/var/lib/ceph/mgr/ceph-c0mds-100# cat keyring
[mgr.c0mds-100]
key = AQDVXI1ZV1U0KRAAFDY6/ZCVzTjxhy0d5/ReSA==

NOTE: this mgr is running on the MDS host, and everything was done with ceph-deploy. But mgr is looking for the keyring under /var/lib/ceph/mgr/ceph-0?
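If something really is starting the daemon with --id 0, a hedged sketch of a fix (ceph auth get-or-create with the mgr profile is the documented manual-deploy path; directory and id names below just mirror the ones in this thread, adjust to taste) is to put a keyring where the id you actually start will look, and start with that id:

```shell
# Key already exists as mgr.c0mds-100 (per 'ceph auth list'); fetch it into
# the directory matching the id the unit starts with:
mkdir -p /var/lib/ceph/mgr/ceph-c0mds-100
ceph auth get-or-create mgr.c0mds-100 \
    mon 'allow profile mgr' osd 'allow *' mds 'allow *' \
    -o /var/lib/ceph/mgr/ceph-c0mds-100/keyring
systemctl restart ceph-mgr@c0mds-100

# ...and make sure nothing (enabled units, ceph.conf) still refers to id "0":
systemctl disable ceph-mgr@0
```

The key point is that ceph-mgr --id X reads /var/lib/ceph/mgr/<cluster>-X/keyring, so the id and the directory name must agree.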
Re: [ceph-users] State of play for RDMA on Luminous
On 29 August 2017 at 00:21, Haomai Wang wrote:
> On Wed, Aug 23, 2017 at 1:26 AM, Florian Haas wrote:
>> - And more broadly, if a user wants to use the performance benefits of
>> RDMA, but not all of their potential Ceph clients have InfiniBand HCAs,
>> what are their options? RoCE?
>
> roce v2 is supported

I've no experience with RoCE, but given Florian's question, is the implication that InfiniBand RDMA and RoCE can be bridged somehow? Otherwise, how do clients with different transports access the same Ceph cluster? I'm guessing IPoIB clients could work with a RoCE Ceph cluster via an Ethernet/InfiniBand gateway (like the Mellanox product), but the IPoIB clients could not do RDMA as this won't cross the gateway (at least, I understand this is the case with the Mellanox product).
Re: [ceph-users] v12.2.0 Luminous released
On 30 August 2017 at 16:05, Mark Kirkwood wrote: > Very nice! > > I tested an upgrade from Jewel, pretty painless. However we forgot to merge: > > http://tracker.ceph.com/issues/20950 > > So the mgr creation requires surgery still :-( > > regards > > Mark > > > > On 30/08/17 06:20, Abhishek Lekshmanan wrote: >> >> We're glad to announce the first release of Luminous v12.2.x long term >> stable release series. There have been major changes since Kraken >> (v11.2.z) and Jewel (v10.2.z), and the upgrade process is non-trivial. >> Please read the release notes carefully. >> >> For more details, links & changelog please refer to the >> complete release notes entry at the Ceph blog: >> http://ceph.com/releases/v12-2-0-luminous-released/ >> >> >> Major Changes from Kraken >> - >> >> - *General*: >>* Ceph now has a simple, built-in web-based dashboard for monitoring >> cluster >> status. >> >> - *RADOS*: >>* *BlueStore*: >> - The new *BlueStore* backend for *ceph-osd* is now stable and the >>new default for newly created OSDs. BlueStore manages data >>stored by each OSD by directly managing the physical HDDs or >>SSDs without the use of an intervening file system like XFS. >>This provides greater performance and features. >> - BlueStore supports full data and metadata checksums >>of all data stored by Ceph. >> - BlueStore supports inline compression using zlib, snappy, or LZ4. >> (Ceph >>also supports zstd for RGW compression but zstd is not recommended >> for >>BlueStore for performance reasons.) >> >>* *Erasure coded* pools now have full support for overwrites >> allowing them to be used with RBD and CephFS. >> >>* *ceph-mgr*: >> - There is a new daemon, *ceph-mgr*, which is a required part of >>any Ceph deployment. Although IO can continue when *ceph-mgr* >>is down, metrics will not refresh and some metrics-related calls >>(e.g., `ceph df`) may block. We recommend deploying several >>instances of *ceph-mgr* for reliability. See the notes on >>Upgrading below. 
>> - The *ceph-mgr* daemon includes a REST-based management API. >>The API is still experimental and somewhat limited but >>will form the basis for API-based management of Ceph going forward. >> - ceph-mgr also includes a Prometheus exporter plugin, which can >> provide Ceph >>perfcounters to Prometheus. >> - ceph-mgr now has a Zabbix plugin. Using zabbix_sender it sends >> trapper >>events to a Zabbix server containing high-level information of the >> Ceph >>cluster. This makes it easy to monitor a Ceph cluster's status and >> send >>out notifications in case of a malfunction. >> >>* The overall *scalability* of the cluster has improved. We have >> successfully tested clusters with up to 10,000 OSDs. >>* Each OSD can now have a device class associated with >> it (e.g., `hdd` or `ssd`), allowing CRUSH rules to trivially map >> data to a subset of devices in the system. Manually writing CRUSH >> rules or manual editing of the CRUSH is normally not required. >>* There is a new upmap exception mechanism that allows individual PGs >> to be moved around to achieve >> a *perfect distribution* (this requires luminous clients). >>* Each OSD now adjusts its default configuration based on whether the >> backing device is an HDD or SSD. Manual tuning generally not >> required. >>* The prototype mClock QoS queueing algorithm is now available. >>* There is now a *backoff* mechanism that prevents OSDs from being >> overloaded by requests to objects or PGs that are not currently able >> to >> process IO. >>* There is a simplified OSD replacement process that is more robust. >>* You can query the supported features and (apparent) releases of >> all connected daemons and clients with `ceph features` >>* You can configure the oldest Ceph client version you wish to allow to >> connect to the cluster via `ceph osd set-require-min-compat-client` >> and >> Ceph will prevent you from enabling features that will break >> compatibility >> with those clients. 
>>* Several `sleep` settings, include `osd_recovery_sleep`, >> `osd_snap_trim_sleep`, and `osd_scrub_sleep` have been >> reimplemented to work efficiently. (These are used in some cases >> to work around issues throttling background work.) >>* Pools are now expected to be associated with the application using >> them. >> Upon completing the upgrade to Luminous, the cluster will attempt to >> associate >> existing pools to known applications (i.e. CephFS, RBD, and RGW). >> In-use pools >> that are not associated to an application will generate a health >> warning. Any >> unassociated pools can be manually associated using the new >> `ceph osd pool applica
Re: [ceph-users] v12.2.0 Luminous released
> On 30 August 2017 at 16:05, Mark Kirkwood wrote:
>> http://tracker.ceph.com/issues/20950
>>
>> So the mgr creation requires surgery still :-(

Is there a way out of this error with ceph-mgr?

mgr init Authentication failed, did you specify a mgr ID with a valid keyring?

root@c0mds-100:~# systemctl status ceph-mgr@c0mds-100
● ceph-mgr@c0mds-100.service - Ceph cluster manager daemon
Loaded: loaded (/lib/systemd/system/ceph-mgr@.service; enabled; vendor preset: enabled)
Active: inactive (dead) (Result: exit-code) since Wed 2017-08-30 16:40:43 AEST; 3min 36s ago
Process: 1821 ExecStart=/usr/bin/ceph-mgr -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=255)
Main PID: 1821 (code=exited, status=255)

As reported previously, pre-12.2.0 versions seemed to create an erroneous ceph-mgr with the wrong host identifier (in my case /var/lib/ceph/mgr/ceph-0 and ceph-c0mds/). Sorry for the previous empty email... keyboard stuck...
Re: [ceph-users] v12.2.0 Luminous released
On 30 August 2017 at 17:43, Mark Kirkwood wrote:
> Yes - you just edit /var/lib/ceph/bootstrap-mgr/ceph.keyring so the key
> matches what 'ceph auth list' shows and re-deploy the mgr (worked for me in
> 12.1.3/4 and 12.2.0).

Thanks for the tip. What I did to get it working:

- had already sync'd the keyrings
- redid ceph-deploy --overwrite-conf mgr create c0mds-100
- ceph mgr module enable dashboard

I wasn't expecting the last item, since I had added the [mgr] section to /etc/ceph/ceph.conf... anyhow, working now.
Re: [ceph-users] Centos7, luminous, cephfs, .snaps
On 30 August 2017 at 18:52, Marc Roos wrote:
> I noticed it is .snap not .snaps

Yes.

> mkdir: cannot create directory ‘.snap/snap1’: Operation not permitted
>
> Is this because my permissions are insufficient on the client id?

Fairly sure you've forgotten this step:

ceph mds set allow_new_snaps true --yes-i-really-mean-it
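For completeness, once snaps are enabled, snapshots are driven entirely by mkdir/rmdir inside the magic .snap directory (the mount point below is just an example):

```shell
# After: ceph mds set allow_new_snaps true --yes-i-really-mean-it
mkdir /mnt/cephfs/mydir/.snap/snap1    # take a snapshot of mydir
ls /mnt/cephfs/mydir/.snap/            # list existing snapshots
rmdir /mnt/cephfs/mydir/.snap/snap1    # remove the snapshot
```

No special tooling needed; the .snap directory is invisible to ls on the parent but always present.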
Re: [ceph-users] v12.2.0 Luminous released
On 30 August 2017 at 20:53, John Spray wrote:
> The mgr_initial_modules setting is only applied at the point of
> cluster creation,

Ok.

> so I would guess that if it didn't seem to take
> effect then this was an upgrade from >=11.x

Not quite: it was a clean install of Luminous, and somewhere around 12.1.3 ceph-deploy got confused about the service name; it created both ceph-0 and c0mds-100 entries under /var/lib/ceph and messed up the keys.
Re: [ceph-users] Bluestore disk colocation using NVRAM, SSD and SATA
On 21 September 2017 at 04:53, Maximiliano Venesio wrote:
> Hi guys i'm reading different documents about bluestore, and it never
> recommends to use NVRAM to store the bluefs db, nevertheless the official
> documentation says that, is better to use the faster device to put the
> block.db in.

Likely it isn't mentioned simply because no one has yet had the opportunity to test NVRAM.

> So how do i have to deploy using bluestore, regarding where i should put
> block.wal and block.db ?

block.* would be best on your NVRAM device, like this:

ceph-deploy osd create --bluestore c0osd-136:/dev/sda --block-wal /dev/nvme0n1 --block-db /dev/nvme0n1
Re: [ceph-users] Bluestore OSD_DATA, WAL & DB
On 26 September 2017 at 01:10, David Turner wrote:
> If they are on separate
> devices, then you need to make it as big as you need to to ensure that it
> won't spill over (or if it does that you're ok with the degraded performance
> while the db partition is full). I haven't come across an equation to judge
> what size should be used for either partition yet.

Is it the case that only the WAL will spill if there is a backlog clearing entries into the DB partition? That is, the WAL's fill-mark oscillates, but the DB steadily grows (depending on the previously mentioned factors of "...extents, checksums, RGW bucket indices, and potentially other random stuff").

Is there an indicator that can be monitored to show that a spill is occurring?
Re: [ceph-users] Bluestore OSD_DATA, WAL & DB
On 26 September 2017 at 08:11, Mark Nelson wrote:
> The WAL should never grow larger than the size of the buffers you've
> specified. It's the DB that can grow and is difficult to estimate both
> because different workloads will cause different numbers of extents and
> objects, but also because rocksdb itself causes a certain amount of
> space-amplification due to a variety of factors.

OK, I was confused about whether both types could spill. So within BlueStore, writes simply block if the WAL hits 100%?

Would a drastic (but quick) way to correct a too-small DB partition (impacting performance) be to destroy the OSD and rebuild it with a larger DB partition?
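One way to watch for a spill is the bluefs section of an OSD's perf counters; a minimal sketch of checking it, assuming the JSON shape of `ceph daemon osd.N perf dump` on recent releases (field names may vary by version):

```python
import json

def check_spillover(perf_dump_json: str) -> dict:
    """Report BlueFS DB usage and whether data has spilled onto the slow device."""
    bluefs = json.loads(perf_dump_json)["bluefs"]
    return {
        "db_used_bytes": bluefs["db_used_bytes"],
        "slow_used_bytes": bluefs["slow_used_bytes"],
        # any bytes placed on the slow device means the DB no longer fits its fast partition
        "spilled": bluefs["slow_used_bytes"] > 0,
    }

# Trimmed sample of what `ceph daemon osd.0 perf dump` might return
sample = ('{"bluefs": {"db_total_bytes": 32212254720, '
          '"db_used_bytes": 4294967296, "slow_used_bytes": 1073741824}}')
print(check_spillover(sample))
```

A non-zero slow_used_bytes is the signal that reads/writes for some DB data are now hitting the spinning disk.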
Re: [ceph-users] clients failing to advance oldest client/flush tid
On 9 October 2017 at 19:21, Jake Grimmett wrote:
> HEALTH_WARN 9 clients failing to advance oldest client/flush tid;
> 1 MDSs report slow requests; 1 MDSs behind on trimming

On a proof-of-concept 12.2.1 cluster (a few random files added, 30 OSDs, default Ceph settings) I can get the above error by doing this from a client:

bonnie++ -s 0 -n 1000 -u 0

This makes 1 million files in a single directory (we wanted to see what might break). It takes a few hours to run but seems to finish without incident. Over that time we get this in the logs:

root@c0mon-101:/var/log/ceph# zcat ceph-mon.c0mon-101.log.6.gz | fgrep MDS_TRIM
2017-10-04 11:14:18.489943 7ff914a26700 0 log_channel(cluster) log [WRN] : Health check failed: 1 MDSs behind on trimming (MDS_TRIM)
2017-10-04 11:14:22.523117 7ff914a26700 0 log_channel(cluster) log [INF] : Health check cleared: MDS_TRIM (was: 1 MDSs behind on trimming)
2017-10-04 11:14:26.589797 7ff914a26700 0 log_channel(cluster) log [WRN] : Health check failed: 1 MDSs behind on trimming (MDS_TRIM)
2017-10-04 11:14:34.614567 7ff914a26700 0 log_channel(cluster) log [INF] : Health check cleared: MDS_TRIM (was: 1 MDSs behind on trimming)
2017-10-04 20:38:22.812032 7ff914a26700 0 log_channel(cluster) log [WRN] : Health check failed: 1 MDSs behind on trimming (MDS_TRIM)
2017-10-04 20:41:14.700521 7ff914a26700 0 log_channel(cluster) log [INF] : Health check cleared: MDS_TRIM (was: 1 MDSs behind on trimming)
root@c0mon-101:/var/log/ceph#
Re: [ceph-users] Bluestore OSD_DATA, WAL & DB
On 3 November 2017 at 07:45, Martin Overgaard Hansen wrote:
> I want to bring this subject back in the light and hope someone can provide
> insight regarding the issue, thanks.

Thanks Martin, I was going to do the same.

Is it possible to make the DB partition (on the fastest device) too big? In other words, is there a point where, for a given set of OSDs (number + size), the DB partition is sized too large and is simply wasting resources? I recall a comment by someone proposing to split up a single large (fast) SSD into 100GB partitions, one for each OSD.

The answer could be couched as some intersection of pool type (RBD / RADOS / CephFS), object change (update?) intensity, size of OSD etc. and a rule of thumb.

An idea occurred to me: by monitoring for the logged spill message (the event when the DB partition spills/overflows to the OSD), OSDs could be (lazily) destroyed and recreated with a new DB partition increased in size, say by 10% each time.
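For what it's worth, later Ceph documentation suggests block.db be at least roughly 4% of the data device; taking that figure purely as an assumed starting point, a back-of-envelope sizing helper looks like:

```python
def suggest_db_size_gb(osd_size_tb: float, pct: float = 4.0) -> float:
    """Rule-of-thumb block.db size as a percentage of the OSD data device.

    The 4% default follows the sizing hint in later Ceph docs; treat it as
    a starting point, not a guarantee against spillover.
    """
    return osd_size_tb * 1000.0 * pct / 100.0

for tb in (4, 8, 12):  # example OSD sizes in TB
    print(f"{tb} TB OSD -> {suggest_db_size_gb(tb):.0f} GB block.db")
```

The real answer still depends on workload (extents, object counts, RGW indices), which is exactly what this thread is trying to pin down.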
Re: [ceph-users] how to improve performance
On 20 November 2017 at 23:36, Christian Balzer wrote:
> On Mon, 20 Nov 2017 14:02:30 +0200 Rudi Ahlers wrote:
>> The SATA drives are ST8000NM0055-1RM112
>>
> Note that these (while fast) have an internal flash cache, limiting them to
> something like 0.2 DWPD.
> Probably not an issue with the WAL/DB on the Intels, but something to keep
> in mind.

I had forgotten about the flash-cache hybrid drives. Seagate calls them SSHD (Solid State Hybrid Drives), and as Christian highlights they have several GB of SSD as an on-board cache.

I looked at the specifications for the ST8000NM0055 but I cannot see it listed as an SSHD; rather it seems like the usual Seagate Enterprise hard drive:

https://www.seagate.com/www-content/product-content/enterprise-hdd-fam/enterprise-capacity-3-5-hdd/constellation-es-4/en-us/docs/ent-capacity-3-5-hdd-8tb-ds1863-2-1510us.pdf

Is there something in the specifications that gives them away as SSHD?
Re: [ceph-users] how to improve performance
On 21 November 2017 at 10:07, Christian Balzer wrote:
> On Tue, 21 Nov 2017 10:00:28 +1100 Nigel Williams wrote:
>> Is there something in the specifications that gives them away as SSHD?
>>
> The 550TB endurance per year for an 8TB drive and the claim of 30% faster
> IOPS would be a dead giveaway, one thinks.

I just found this other answer:

http://products.wdc.com/library/other/2579-772003.pdf

Hard-drive manufacturers introduced workload specifications because they model failure rates better than MTTF does.

I see the drive has 2MB of NOR flash for write-caching; what happens when this wears out?
[ceph-users] Transparent huge pages
Given that memory is a key resource for Ceph, this advice about switching the Transparent Huge Pages kernel setting to madvise would be worth testing, to see if THP is helping or hindering.

Article: https://blog.nelhage.com/post/transparent-hugepages/
Discussion: https://news.ycombinator.com/item?id=15795337

echo madvise | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
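The sysfs file marks the active mode in brackets, so verifying the change is easy to script; a tiny checker (the bracketed-value format is what mainline kernels emit, assumed here):

```python
import re

def thp_mode(sysfs_contents: str) -> str:
    """Extract the active THP mode, e.g. 'always [madvise] never' -> 'madvise'."""
    m = re.search(r"\[(\w+)\]", sysfs_contents)
    if m is None:
        raise ValueError("unrecognised THP sysfs format")
    return m.group(1)

# In practice, feed it the file contents:
#   thp_mode(open("/sys/kernel/mm/transparent_hugepage/enabled").read())
print(thp_mode("always [madvise] never"))  # -> madvise
```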
Re: [ceph-users] CephFS - Mounting a second Ceph file system
On 29 November 2017 at 01:51, Daniel Baumann wrote:
> On 11/28/17 15:09, Geoffrey Rhodes wrote:
>> I'd like to run more than one Ceph file system in the same cluster.

Are there opinions on how stable multiple filesystems per single Ceph cluster are in practice? Is anyone using it actively with a stressful load?

I see the docs still place it under Experimental:

http://docs.ceph.com/docs/master/cephfs/experimental-features/#multiple-filesystems-within-a-ceph-cluster
Re: [ceph-users] BlueStore upgrade steps broken
On 18 August 2018 at 03:06, David Turner wrote:
> The WAL will choose the fastest device available.

Any idea how it makes this determination automatically? Is it doing a hdparm -t or similar? Is fastest = bandwidth, IOPS or latency?
Re: [ceph-users] ceph 12.2.4 - which OSD has slow requests ?
On 18 April 2018 at 05:52, Steven Vacaroaia wrote:
> I can see many slow requests in the logs but no clue which OSD is the
> culprit
> How can I find the culprit ?

ceph osd perf

or

ceph pg dump osds -f json-pretty | jq .[].fs_perf_stat

Searching the ML archives for threads about slow requests will surface several techniques to explore:

slow requests site:http://lists.ceph.com/pipermail/ceph-users-ceph.com/
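To go from `ceph osd perf` to a ranked list of suspects, the JSON form can be sorted by commit latency; the field names below are assumed from Luminous-era output, so check them against your release:

```python
import json

def slowest_osds(perf_json: str, top: int = 5):
    """Rank OSDs by commit latency from `ceph osd perf -f json` output."""
    infos = json.loads(perf_json)["osd_perf_infos"]
    ranked = sorted(infos, key=lambda o: o["perf_stats"]["commit_latency_ms"], reverse=True)
    return [(o["id"], o["perf_stats"]["commit_latency_ms"]) for o in ranked[:top]]

# Trimmed sample of `ceph osd perf -f json` output
sample = json.dumps({"osd_perf_infos": [
    {"id": 0, "perf_stats": {"commit_latency_ms": 12, "apply_latency_ms": 3}},
    {"id": 1, "perf_stats": {"commit_latency_ms": 450, "apply_latency_ms": 90}},
    {"id": 2, "perf_stats": {"commit_latency_ms": 8, "apply_latency_ms": 2}},
]})
print(slowest_osds(sample))  # -> [(1, 450), (0, 12), (2, 8)]
```

An OSD whose latency sits far above its peers is the usual slow-request culprit.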
[ceph-users] network connectivity test tool?
I thought I had bookmarked a neat shell script that used the ceph.conf definitions to do an all-to-all / all-to-one check of network connectivity for a Ceph cluster (useful for discovering problems with jumbo frames), but I've lost the bookmark, and after trawling GitHub and trying various keywords I cannot find it.

I thought the tool was in Ceph CBT or was a CERN-developed script, but neither yielded a hit. Anyone know where it is? Thanks.
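In the meantime, a rough stand-in is easy to sketch: parse mon_host out of ceph.conf and emit don't-fragment pings sized for a 9000-byte MTU (8972-byte payload plus 28 bytes of IP/ICMP headers). The config layout here is an assumption; adjust the key for "mon host" vs "mon_host" and extend it to OSD hosts as needed:

```python
import configparser
import io

def jumbo_ping_cmds(ceph_conf_text: str, payload: int = 8972):
    """Build `ping -M do` (don't-fragment) commands for every mon host in ceph.conf."""
    cfg = configparser.ConfigParser()
    cfg.read_file(io.StringIO(ceph_conf_text))
    hosts = [h.strip() for h in cfg["global"]["mon_host"].split(",")]
    return [f"ping -M do -s {payload} -c 3 {host}" for host in hosts]

sample_conf = """\
[global]
mon_host = 10.0.0.1, 10.0.0.2, 10.0.0.3
"""
for cmd in jumbo_ping_cmds(sample_conf):
    print(cmd)
```

If a host answers a normal ping but drops the 8972-byte don't-fragment ping, something in the path is not passing jumbo frames.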
Re: [ceph-users] State of Ceph documention
On Fri, Feb 26, 2016 at 3:10 PM, Christian Balzer wrote:
> Then we come to a typical problem for fast evolving SW like Ceph, things
> that are not present in older versions.

I was going to post on this too (I had similar frustrations), and would like to propose a move to splitting the documentation by version:

OLD
http://docs.ceph.com/docs/master/rados/operations/cache-tiering/

NEW
http://docs.ceph.com/docs/hammer/rados/operations/cache-tiering/
http://docs.ceph.com/docs/infernalis/rados/operations/cache-tiering/
http://docs.ceph.com/docs/jewel/rados/operations/cache-tiering/

and so on. When a new version is started, the documentation would be 100% cloned and the tree restructured around the version. It could equally be a drop-down on the page to select the version.

Postgres, for example, uses a similar mechanism: http://www.postgresql.org/docs/ (note the version numbers embedded in the URLs). I like their commenting mechanism too, as it provides a running narrative of changes that should be considered as practice develops around things to do or avoid.

Once the documentation is cloned for the new version, all the inapplicable material would be removed and the new features/practice changes added.
Re: [ceph-users] State of Ceph documention
On Fri, Feb 26, 2016 at 4:09 PM, Adam Tygart wrote:
> The docs are already split by version, although it doesn't help that
> it isn't linked in an obvious manner.
>
> http://docs.ceph.com/docs/master/rados/operations/cache-tiering/

Is there any reason to keep this "master" (version-less) variant, given how much confusion it causes? I think I noticed the version split once before, but it didn't lodge in my mind, and when I looked for something today I hit "master" and got no hits for the version I should have been looking at.

I'd be glad to contribute to the documentation effort. For example, I would like to be able to ask questions about the terminology that is scattered through the documentation and that I think needs better explanation. I'm not sure about pull requests that try to annotate what is there: some parts would become a wall of text, whereas the explanation might be better suited to a (more informal) comment thread at the bottom of the page that can be browsed (mainly by beginners trying to navigate an unfamiliar architecture).
Re: [ceph-users] State of Ceph documention
On Fri, Feb 26, 2016 at 11:28 PM, John Spray wrote:
> Some projects have big angry warning banners at the top of their
> master branch documentation, I think perhaps we should do that too,
> and at the same time try to find a way to steer google hits to the
> latest stable branch docs rather than to master.

Are there reasons to "publish" the version-less master at all? Maybe I've missed the explanation for why master is necessary, but could it be completely hidden?
Re: [ceph-users] State of Ceph documention
On Sat, Feb 27, 2016 at 12:08 AM, Andy Allan wrote:
> When I made a (trivial, to be fair) documentation PR it was dealt with
> immediately, both when I opened it, and when I fixed up my commit
> message. I'd recommend that if anyone sees anything wrong with the
> docs, just submit a PR with the fix.

Are we collectively OK with the discussion about the documentation happening via the repo (presumably on GitHub)? The limitation with PRs is that the submitter has to suggest a change, when sometimes it is a less formal interpretation question. Or would it be OK to conduct the discussions on this mailing list to form up the ultimate PR?

I'm reluctant to suggest a ceph-docs mailing list, but that would be another option if we can't have commentary on the documentation web pages.
Re: [ceph-users] BlueFS spillover detected - 14.2.1
On Thu, 20 Jun 2019 at 09:12, Vitaliy Filippov wrote:
> All values except 4, 30 and 286 GB are currently useless in ceph with
> default rocksdb settings :)

However, several commenters have said that rocksdb needs extra space during compaction, and hence the DB partition needs to be twice those sizes, so 8GB, 60GB and 600GB. Does rocksdb spill during compaction if it doesn't have enough space?
[ceph-users] show-prediction-config - no valid command found?
Have I missed a step? The diskprediction module is not working for me.

root@cnx-11:/var/log/ceph# ceph device show-prediction-config
no valid command found; 10 closest matches:
root@cnx-11:/var/log/ceph# ceph mgr module ls
{
    "enabled_modules": [
        "dashboard",
        "diskprediction_cloud",
        "iostat",
        "pg_autoscaler",
        "prometheus",
        "restful"
    ],...
root@cnx-11:/var/log/ceph# ceph -v
ceph version 14.2.1 (d555a9489eb35f84f2e1ef49b77e19da9d113972) nautilus (stable)

One other failure I get, for:

ceph device get-health-metrics INTEL_SSDPE2KE020T7_BTLE74200D8J2P0DGN
...
"nvme_vendor": "intel",
"dev": "/dev/nvme0n1",
"error": "smartctl returned invalid JSON"
...

with smartmontools 7.1. Running this version directly against the device, the JSON output parses OK (using an online parser).
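When chasing this kind of error, it helps to run the same output through a strict parser and see exactly where it fails; a debugging sketch (not the mgr module's actual code path, and the smartctl invocation in the docstring is only illustrative):

```python
import json

def validate_smartctl_json(raw: str):
    """Return the parsed smartctl JSON, or a message describing where parsing failed.

    In real use, raw might come from something like:
        subprocess.check_output(["smartctl", "-x", "--json", "/dev/nvme0n1"], text=True)
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        # show a window of text around the failure point
        context = raw[max(0, exc.pos - 20):exc.pos + 20]
        return f"invalid JSON at line {exc.lineno} col {exc.colno}: ...{context!r}..."

print(validate_smartctl_json('{"device": {"name": "/dev/nvme0n1"}}'))
print(validate_smartctl_json('{"device": not json'))
```

If the raw output parses fine here, the problem is more likely in how the module invokes smartctl (flags, stderr mixed into stdout) than in smartctl itself.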
[ceph-users] Nautilus - cephfs auth caps problem?
I am getting "Operation not permitted" on a write when trying to set caps for a user. An admin user (allow * for everything) works OK.

This does not work:

caps: [mds] allow r,allow rw path=/home
caps: [mon] allow r
caps: [osd] allow rwx tag cephfs data=cephfs_data2

This does work:

caps: [mds] allow r,allow rw path=/home
caps: [mon] allow r
caps: [osd] allow *

Nothing specific I set for the OSD caps allows files to be written, although I can create files and directories.
Re: [ceph-users] Nautilus - cephfs auth caps problem?
Thanks for the tip. I did wonder about that, checked it at one point, and assumed it was OK:

root@cnx-11:~# ceph osd pool application get cephfs_data
{
    "cephfs": {
        "data": "cephfs"
    }
}
root@cnx-11:~# ceph osd pool application get cephfs_data2
{
    "cephfs": {
        "data": "cephfs"
    }
}
root@cnx-11:~# ceph osd pool application get cephfs_metadata
{
    "cephfs": {
        "metadata": "cephfs"
    }
}
root@cnx-11:~#

Is the act of setting it again likely to make a needed change elsewhere that is fixed by that git pull?

On Wed, 3 Jul 2019 at 17:20, Paul Emmerich wrote:
> Your cephfs was probably created with a buggy version that didn't set the
> metadata tags on the data pools correctly. IIRC there still isn't any
> automated migration of old broken pools.
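For anyone checking the same thing across many pools, the tag can be verified programmatically from the `-f json` form of the command above (output shape assumed to match the text output shown):

```python
import json

def has_cephfs_data_tag(app_get_json: str, fs_name: str) -> bool:
    """Check pool application metadata (as returned by
    `ceph osd pool application get <pool> -f json`) for a cephfs
    'data' tag matching the filesystem name."""
    tags = json.loads(app_get_json)
    return tags.get("cephfs", {}).get("data") == fs_name

print(has_cephfs_data_tag('{"cephfs": {"data": "cephfs"}}', "cephfs"))  # -> True
print(has_cephfs_data_tag('{"rbd": {}}', "cephfs"))                     # -> False
```

A pool that fails this check is the one whose `allow rwx tag cephfs data=<fs>` cap would be rejected.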
Re: [ceph-users] Nautilus 14.2.1 / 14.2.2 crash
On Sat, 20 Jul 2019 at 04:28, Nathan Fish wrote:
> On further investigation, it seems to be this bug:
> http://tracker.ceph.com/issues/38724

We just upgraded to 14.2.2, and had a dozen OSDs at 14.2.2 go down with this bug; recovered with:

systemctl reset-failed ceph-osd@160
systemctl start ceph-osd@160
[ceph-users] fixing a bad PG per OSD decision with pg-autoscaling?
Due to a gross miscalculation several years ago I set way too many PGs for our original Hammer cluster. We've lived with it ever since, but now that we are on Luminous, changes result in stuck requests and balancing problems.

The cluster currently has 12% misplaced objects, and is grinding through the re-balance but is unusable to clients (even with osd_max_pg_per_osd_hard_ratio set to 32, and mon_max_pg_per_osd set to 1000).

Can I safely press on upgrading to Nautilus in this state, so I can enable pg-autoscaling to finally fix the problem? Thanks.
[ceph-users] cephfs 1 large omap objects
Out of the blue this popped up (on an otherwise healthy cluster):

HEALTH_WARN 1 large omap objects
LARGE_OMAP_OBJECTS 1 large omap objects
    1 large objects found in pool 'cephfs_metadata'
    Search the cluster log for 'Large omap object found' for more details.

"Search the cluster log" is somewhat opaque: there are logs for many daemons, so what is the "cluster" log? In the ML history some found it in the OSD logs.

Another post suggested removing lost+found, but using cephfs-shell I don't see one at the top level. Is there another way to disable this "feature"? Thanks.
Re: [ceph-users] cephfs 1 large omap objects
I followed some other suggested steps, and have this:

root@cnx-17:/var/log/ceph# zcat ceph-osd.178.log.?.gz | fgrep Large
2019-10-02 13:28:39.412 7f482ab1c700 0 log_channel(cluster) log [WRN] : Large omap object found. Object: 2:654134d2:::mds0_openfiles.0:head Key count: 306331 Size (bytes): 13993148

root@cnx-17:/var/log/ceph# ceph daemon osd.178 config show | grep osd_deep_scrub_large_omap
    "osd_deep_scrub_large_omap_object_key_threshold": "20",
    "osd_deep_scrub_large_omap_object_value_sum_threshold": "1073741824",

root@cnx-11:~# rados -p cephfs_metadata stat 'mds0_openfiles.0'
cephfs_metadata/mds0_openfiles.0 mtime 2019-10-06 23:37:23.00, size 0
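Pulling the object name and key count out of those warnings is easy to script if you need to track them over time (the regex matches the log line format shown above):

```python
import re

def parse_large_omap(line: str):
    """Extract object name, key count and byte size from an OSD 'Large omap' warning."""
    m = re.search(r"Object: (\S+) Key count: (\d+) Size \(bytes\): (\d+)", line)
    if m is None:
        return None
    return {"object": m.group(1), "keys": int(m.group(2)), "bytes": int(m.group(3))}

log_line = ("2019-10-02 13:28:39.412 7f482ab1c700 0 log_channel(cluster) log [WRN] : "
            "Large omap object found. Object: 2:654134d2:::mds0_openfiles.0:head "
            "Key count: 306331 Size (bytes): 13993148")
print(parse_large_omap(log_line))
```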
Re: [ceph-users] cephfs 1 large omap objects
I've adjusted the threshold:

ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 35

A colleague suggested that this will take effect on the next deep scrub.

Is the default of 200,000 too small? Will this be adjusted in future releases, or is it meant to be tuned in some use-cases?
Re: [ceph-users] Issues with Nautilus 14.2.6 ceph-volume lvm batch --bluestore ?
On Mon, 20 Jan 2020 at 14:15, Dave Hall wrote:
> BTW, I did try to search the list archives via
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/, but that didn't work
> well for me. Is there another way to search?

With your favorite search engine (say Google / DDG), you can do this:

ceph site:http://lists.ceph.com/pipermail/ceph-users-ceph.com/