Re: [ceph-users] Can't create erasure coded pools with k+m greater than hosts?

2019-10-18 Thread Chris Taylor
/luminous/rados/operations/crush-map-edits/ This link is for creating the EC rules for 4+2 with only 3 hosts: https://ceph.io/planet/erasure-code-on-small-clusters/ I hope that helps! Chris On 2019-10-18 2:55 pm, Salsa wrote: Ok, I'm lost here. How am I supposed to write a crush rule
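
For context, the approach in that article boils down to a CRUSH rule that picks 3 hosts and then 2 OSDs on each, giving the 6 placements needed for k=4, m=2. A minimal sketch adapted from it (rule name and id are placeholders):

    rule ec42 {
            id 1
            type erasure
            step take default
            step choose indep 3 type host
            step chooseleaf indep 2 type osd
            step emit
    }

Note the trade-off: losing one host takes out two shards at once, which is exactly m, so the pool stays readable but cannot tolerate a second failure until recovery completes.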

[ceph-users] bluestore db & wal use spdk device how to ?

2019-08-05 Thread Chris Hsiang
Hi All, I have multiple NVMe SSDs and I wish to use two of them with SPDK as the BlueStore db & wal. My assumption would be that in ceph.conf, under the [osd] section, I put the following: bluestore_block_db_path = "spdk::01:00.0" bluestore_block_db_size = 40 * 1024 * 1024 * 1024 (40G) Then how do I prepare the OSD? ceph-volu
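
For what it's worth, the Luminous-era SPDK docs identified the device by its NVMe serial number rather than a bare PCIe address, and ceph.conf does not evaluate arithmetic, so the size has to be written out in bytes. A hedged sketch (the serial number is a placeholder taken from lspci output):

    [osd]
    bluestore_block_db_path = spdk:55cd2e404bd73932
    # 40 GiB spelled out -- ceph.conf will not compute 40 * 1024 * 1024 * 1024
    bluestore_block_db_size = 42949672960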

Re: [ceph-users] Changing the release cadence

2019-06-05 Thread Chris Taylor
It seems like since the change to the 9 months cadence it has been bumpy for the Debian based installs. Changing to a 12 month cadence sounds like a good idea. Perhaps some Debian maintainers can suggest a good month for them to get the packages in time for their release cycle. On 2019-06-0

Re: [ceph-users] One host with 24 OSDs is offline - best way to get it back online

2019-01-27 Thread Chris
the data movement back and forth. And if you see that recovering the node will take a long time, just manually set things out for the time being. Christian On Sun, 27 Jan 2019 00:02:54 +0100 Götz Reinicke wrote: Dear Chris, Thanks for your feedback. The node/OSDs in question are part of an

Re: [ceph-users] One host with 24 OSDs is offline - best way to get it back online

2019-01-26 Thread Chris
It sort of depends on your workload/use case. Recovery operations can be computationally expensive. If your load is light because it's the weekend, you should be able to turn that host back on as soon as you resolve whatever the issue is, with minimal impact. You can also increase the priority
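
The usual way to take rebalancing off the table while the host is down is the noout flag:

    # stop CRUSH from marking the host's OSDs out (and triggering data movement)
    ceph osd set noout
    # ...repair and boot the host, wait for its OSDs to rejoin...
    ceph osd unset noout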

[ceph-users] Garbage collection growing and db_compaction with small file uploads

2019-01-09 Thread Chris Sarginson
is manually and use the rados rm command to remove the objects from the .rgw.buckets pool after having a look through some historic posts on this list, and then remove the garbage collection objects - is this a reasonable solution? Are there any recommendations for dealing with a garbage col

Re: [ceph-users] Help Ceph Cluster Down

2019-01-03 Thread Chris
If you added OSDs and then deleted them repeatedly without waiting for replication to finish as the cluster attempted to re-balance across them, it's highly likely that you are permanently missing PGs (especially if the disks were zapped each time). If those 3 down OSDs can be revived there is

[ceph-users] MDS damaged after mimic 13.2.1 to 13.2.2 upgrade

2018-11-20 Thread Chris Martin
ceph-mds: ceph-mds depends on ceph-base (= 13.2.1-1xenial); however: Version of ceph-base on system is 13.2.2-1xenial. ``` I don't think I want to downgrade ceph-base to 13.2.1. Thank you, Chris Martin > Sorry. this is caused wrong backport. downgrading mds to 13.2.1 and > marking

Re: [ceph-users] slow ops after cephfs snapshot removal

2018-11-09 Thread Chris Taylor
> On Nov 9, 2018, at 1:38 PM, Gregory Farnum wrote: > >> On Fri, Nov 9, 2018 at 2:24 AM Kenneth Waegeman >> wrote: >> Hi all, >> >> On Mimic 13.2.1, we are seeing blocked ops on cephfs after removing some >> snapshots: >> >> [root@osd001 ~]# ceph -s >>cluster: >> id: 92bfcf0

Re: [ceph-users] Resolving Large omap objects in RGW index pool

2018-10-18 Thread Chris Sarginson
10 15:06:38.940464Z", This obviously still leaves me with the original issue noticed, which is multiple instances of buckets that seem to have been repeatedly resharded to the same number of shards as the currently active index. From having a search around the tracker it seems like this may be worth

Re: [ceph-users] Resolving Large omap objects in RGW index pool

2018-10-16 Thread Chris Sarginson
m/issues/24603 Should I be OK to loop through these indexes and remove any with a reshard_status of 2, a new_bucket_instance_id that does not match the bucket_instance_id returned by the command: radosgw-admin bucket stats --bucket ${bucket} I'd ideally like to get to a point where I can turn
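
A sketch of that check as a loop (assumes jq is available; the bucket name is a placeholder, and the exact field paths under .data may differ slightly by release):

    bucket=mybucket
    current=$(radosgw-admin bucket stats --bucket=${bucket} | jq -r '.id')
    for inst in $(radosgw-admin metadata list bucket.instance | jq -r '.[]' | grep "^${bucket}:"); do
        meta=$(radosgw-admin metadata get bucket.instance:${inst})
        status=$(echo "${meta}" | jq -r '.data.bucket_info.reshard_status')
        newid=$(echo "${meta}" | jq -r '.data.bucket_info.new_bucket_instance_id')
        # candidate for cleanup: reshard_status 2 but not pointing at the live index
        [ "${status}" = "2" ] && [ "${newid}" != "${current}" ] && echo "stale: ${inst}"
    done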

Re: [ceph-users] Resolving Large omap objects in RGW index pool

2018-10-04 Thread Chris Sarginson
these other buckets, they are exhibiting the same sort of symptoms as the first (multiple instances of radosgw-admin metadata get showing what seem to be multiple resharding processes being run, with different mtimes recorded). Thanks Chris On Thu, 4 Oct 2018 at 16:21 Konstantin Shalygin wrote:

[ceph-users] Resolving Large omap objects in RGW index pool

2018-10-04 Thread Chris Sarginson
e output is available to view here: https://pastebin.com/g1TJfKLU It would be useful if anyone can offer some clarification on how to proceed from this situation, identifying and removing any old/stale indexes from the index pool (if that is the case), as I've not been able to spot anything in the archives. If there's any further information that is needed for additional context please let me know. Thanks Chris

[ceph-users] Intermittent slow/blocked requests on one node

2018-08-22 Thread Chris Martin
it appear to the client that the write has completed? Thank you! Information about my cluster and example warning messages follow. Chris Martin About my cluster: Luminous (12.2.4), 5 nodes, each with 12 OSDs (one rotary HDD per OSD), and a shared SSD in each node with 24 partitions for all the Rock

Re: [ceph-users] ceph plugin balancer error

2018-07-05 Thread Chris Hsiang
s written for python 2.7... this might be related https://github.com/ceph/ceph/pull/21446 this might be opensuse building ceph package issue chris

Re: [ceph-users] ceph plugin balancer error

2018-07-05 Thread Chris Hsiang
y default python env is 2.7... so under dict object should have iteritems method.... Chris

[ceph-users] ceph plugin balancer error

2018-07-05 Thread Chris Hsiang
'dict' object has no attribute 'iteritems'. What config needs to be done in order to get it to work? Chris

Re: [ceph-users] Frequent slow requests

2018-06-19 Thread Chris Taylor
On 2018-06-19 12:17 pm, Frank de Bot (lists) wrote: Frank (lists) wrote: Hi, On a small cluster (3 nodes) I frequently have slow requests. When dumping the inflight ops from the hanging OSD, it seems it doesn't get a 'response' for one of the subops. The events always look like: I've do

Re: [ceph-users] Journal flushed on osd clean shutdown?

2018-06-13 Thread Chris Dunlop
Excellent news - tks! On Wed, Jun 13, 2018 at 11:50:15AM +0200, Wido den Hollander wrote: On 06/13/2018 11:39 AM, Chris Dunlop wrote: Hi, Is the osd journal flushed completely on a clean shutdown? In this case, with Jewel, and FileStore osds, and a "clean shutdown" being: It i

[ceph-users] Journal flushed on osd clean shutdown?

2018-06-13 Thread Chris Dunlop
I want to be more careful there. One option is to simply kill the affected osds and recreate them, and allow the data redundancy to take care of things. However I'm wondering if things should theoretically be ok if I carefully shutdown and restart each of the remaining osds in

Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-02-22 Thread Chris Sarginson
single failure domain but were unable to get to deploy additional firmware to test. The Samsung should fit your requirements. http://www.samsung.com/semiconductor/minisite/ssd/product/enterprise/sm863a/ Regards Chris On Thu, 22 Feb 2018 at 12:50 Caspar Smit wrote: > Hi Sean and David, > &g

[ceph-users] ceph mons de-synced from rest of cluster?

2018-02-11 Thread Chris Apsey
ing on the status of the cluster. Has anyone seen this before/know how to sync the mons up to what the OSDs are actually reporting? I see no connectivity errors in the logs of the mons or the osds. Thanks, --- v/r Chris Apsey bitskr...@bitskrieg.net https://www.bit

Re: [ceph-users] Increase recovery / backfilling speed (with many small objects)

2018-01-05 Thread Chris Sarginson
You probably want to consider increasing osd max backfills You should be able to inject this online http://docs.ceph.com/docs/luminous/rados/configuration/osd-config-ref/ You might want to drop your osd recovery max active settings back down to around 2 or 3, although with it being SSD your perf
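
Both settings can be injected online, e.g.:

    ceph tell 'osd.*' injectargs '--osd-max-backfills 4 --osd-recovery-max-active 3'

The values here are illustrative; persist whatever works for you in ceph.conf afterwards.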

Re: [ceph-users] Switch to replica 3

2017-11-20 Thread Chris Taylor
On 2017-11-20 3:39 am, Matteo Dacrema wrote: Yes I mean the existing Cluster. SSDs are on a fully separate pool. Cluster is not busy during recovery and deep scrubs but I think it’s better to limit replication in some way when switching to replica 3. My question is to understand if I need to se

Re: [ceph-users] Hammer to Jewel Upgrade - Extreme OSD Boot Time

2017-11-06 Thread Chris Jones
ch to XFS, which is the recommended filesystem for CEPH. XFS does not appear to require any kind of meta cache due to different handling of meta info in the inode. -- Chris From: Willem Jan Withagen Sent: Wedn

Re: [ceph-users] Hammer to Jewel Upgrade - Extreme OSD Boot Time

2017-11-01 Thread Chris Jones
phos.lfn3-alt", 0x7fd4e8017680, 1024) = -1 ENODATA (No data available) <0.18> ------ Christopher J. Jones From: Gregory Farnum Sent: Monday, October 30, 2017 6:20:15 PM To: Chris Jones Cc: ceph-users@lists.ceph.com Subje

Re: [ceph-users] Hammer to Jewel Upgrade - Extreme OSD Boot Time

2017-10-26 Thread Chris Jones
? -- Christopher J. Jones From: Chris Jones Sent: Wednesday, October 25, 2017 12:52:13 PM To: ceph-users@lists.ceph.com Subject: Hammer to Jewel Upgrade - Extreme OSD Boot Time After upgrading from CEPH Hammer to Jewel, we are experiencing extremely

[ceph-users] Hammer to Jewel Upgrade - Extreme OSD Boot Time

2017-10-25 Thread Chris Jones
After upgrading from CEPH Hammer to Jewel, we are experiencing extremely long osd boot duration. This long boot time is a huge concern for us and are looking for insight into how we can speed up the boot time. In Hammer, OSD boot time was approx 3 minutes. After upgrading to Jewel, boot time i

Re: [ceph-users] remove require_jewel_osds flag after upgrade to kraken

2017-07-13 Thread Chris Sarginson
The flag is fine, it's just to ensure that OSDs from a release before Jewel can't be added to the cluster: See http://ceph.com/geen-categorie/v10-2-4-jewel-released/ under "Upgrading from hammer" On Thu, 13 Jul 2017 at 07:59 Jan Krcmar wrote: > hi, > > is it possible to remove the require_jewel
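
For reference, the flag is set (not removed) once every OSD is running Jewel or later:

    ceph osd set require_jewel_osds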

Re: [ceph-users] autoconfigured haproxy service?

2017-07-12 Thread Chris Jones
Hi Sage, The automated tool Cepheus https://github.com/cepheus-io/cepheus does this with ceph-chef. It's based on json data for a given environment. It uses Chef and Ansible. If someone wanted to break out the haproxy (ADC) portion into a package then it has a good model for HAProxy they could loo

[ceph-users] osdmap several thousand epochs behind latest

2017-07-09 Thread Chris Apsey
osd unset noup' (or anything to do after)? Thanks in advance, -- v/r Chris Apsey bitskr...@bitskrieg.net https://www.bitskrieg.net

Re: [ceph-users] Sharing SSD journals and SSD drive choice

2017-04-26 Thread Chris Apsey
lures. Recovering from misplaced objects while also attempting to serve clients is no fun. --- v/r Chris Apsey bitskr...@bitskrieg.net https://www.bitskrieg.net On 2017-04-26 10:53, Adam Carheden wrote: What I'm trying to get from the list is /why/ the "enterprise" drives are im

Re: [ceph-users] Creating journal on needed partition

2017-04-17 Thread Chris Apsey
some control flow. We partition an nvme device and then create symlinks from osds to the partitions in a pre-determined fashion. We don't use ceph-disk at all. --- v/r Chris Apsey bitskr...@bitskrieg.net https://www.bitskrieg.net On 2017-04-17 08:56, Nikita Shalnov wrote: Hi all. Is

Re: [ceph-users] saving file on cephFS mount using vi takes pause/time

2017-04-13 Thread Chris Sarginson
Is it related to the recovery behaviour of vim creating a swap file, which I think nano does not do? http://vimdoc.sourceforge.net/htmldoc/recover.html A sync into cephfs I think needs the write to get confirmed all the way down from the osds performing the write before it returns the confir

Re: [ceph-users] Safely Upgrading OS on a live Ceph Cluster

2017-03-02 Thread Heller, Chris
Success! There was an issue related to my operating system install procedure that was causing the journals to become corrupt, but it was not caused by ceph! That bug fixed; now the procedure on shutdown in this thread has been verified to work as expected. Thanks for all the help. -Chris >

Re: [ceph-users] Safely Upgrading OS on a live Ceph Cluster

2017-03-01 Thread Heller, Chris
I see. My journal is specified in ceph.conf. I'm not removing it from the OSD so sounds like flushing isn't needed in my case. -Chris > On Mar 1, 2017, at 9:31 AM, Peter Maloney > wrote: > > On 03/01/17 14:41, Heller, Chris wrote: >> That is a good question, an

Re: [ceph-users] Safely Upgrading OS on a live Ceph Cluster

2017-03-01 Thread Heller, Chris
That is a good question, and I'm not sure how to answer. The journal is on its own volume, and is not a symlink. Also how does one flush the journal? That seems like an important step when bringing down a cluster safely. -Chris > On Mar 1, 2017, at 8:37 AM, Peter Maloney > wrote:
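
For the record, a FileStore journal is flushed with the daemon stopped; a sketch for a single OSD (id 0 is a placeholder):

    # stop the daemon first, then replay any journaled writes into the filestore
    service ceph stop osd.0        # or: systemctl stop ceph-osd@0
    ceph-osd -i 0 --flush-journal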

Re: [ceph-users] Antw: Safely Upgrading OS on a live Ceph Cluster

2017-03-01 Thread Heller, Chris
In my case the version will be identical. But I might have to do this node by node approach if I can't stabilize the more general shutdown/bring-up approach. There are 192 OSD in my cluster, so it will take a while to go node by node unfortunately. -Chris > On Mar 1, 2017, at 2:50 AM,

Re: [ceph-users] Safely Upgrading OS on a live Ceph Cluster

2017-02-28 Thread Heller, Chris
ceph osd stat` shows that the 'norecover' flag is still set. I'm going to wait out the recovery and see if the Ceph FS is OK. That would be huge if it is. But I am curious why I lost an OSD, and why recovery is happening with 'norecover' still set. -Chris > O

[ceph-users] Safely Upgrading OS on a live Ceph Cluster

2017-02-27 Thread Heller, Chris
(clone()+0x6d) [0x7f31d919c51d] NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this. How can I safely stop a Ceph cluster, so that it will cleanly start back up again? -Chris

Re: [ceph-users] civetweb deamon dies on https port

2017-01-19 Thread Chris Sarginson
You look to have a typo in this line: rgw_frontends = "civetweb port=8080s ssl_certificate=/etc/pki/tls/ cephrgw01.crt" It would seem from the error it should be port=8080, not port=8080s. On Thu, 19 Jan 2017 at 08:59 Iban Cabrillo wrote: > Dear cephers, >I just finish the integration betw

[ceph-users] Ceph.com

2017-01-16 Thread Chris Jones
The site looks great! Good job!

Re: [ceph-users] Ceph Monitoring

2017-01-13 Thread Chris Jones

[ceph-users] Ceph Monitoring

2017-01-13 Thread Chris Jones
ust trying to get a pulse of what others are doing. Thanks in advance. -- Best Regards, Chris Jones Bloomberg

Re: [ceph-users] Storage system

2017-01-04 Thread Chris Jones
Based on this limited info, Object storage if behind proxy. We use Ceph behind HAProxy and hardware load-balancers at Bloomberg. Our Chef recipes are at https://github.com/ceph/ceph-chef and https://github.com/bloomberg/chef-bcs. The chef-bcs cookbooks show the HAProxy info. Thanks, Chris On Wed

Re: [ceph-users] CephFS FAILED assert(dn->get_linkage()->is_null())

2016-12-09 Thread Chris Sarginson
starting point for proceeding with that: http://ceph-users.ceph.narkive.com/EfFTUPyP/how-to-fix-the-mds-damaged-issue Chris On Fri, 9 Dec 2016 at 19:26 Goncalo Borges wrote: > Hi Sean, Rob. > > I saw on the tracker that you were able to resolve the mds assert by > manually cleani

Re: [ceph-users] rgw civetweb ssl official documentation?

2016-12-07 Thread Chris Jones

Re: [ceph-users] How are replicas spread in default crush configuration?

2016-11-23 Thread Chris Taylor
have not had to make that change before so you will want to read up on it first. Don't take my word for it. http://docs.ceph.com/docs/master/rados/operations/crush-map/#crush-map-parameters Hope that helps. Chris On 2016-11-23 1:32 pm, Kevin Olbrich wrote: > Hi, > > ju

Re: [ceph-users] ceph cluster having blocke requests very frequently

2016-11-14 Thread Chris Taylor
Maybe a long shot, but have you checked OSD memory usage? Are the OSD hosts low on RAM and swapping to disk? I am not familiar with your issue, but thought that might cause it. Chris On 2016-11-14 3:29 pm, Brad Hubbard wrote: > Have you looked for clues in the output of dump_historic_

Re: [ceph-users] cephfs slow delete

2016-10-14 Thread Heller, Chris
Just a thought, but since a directory tree is a first class item in cephfs, could the wire protocol be extended with an “recursive delete” operation, specifically for cases like this? On 10/14/16, 4:16 PM, "Gregory Farnum" wrote: On Fri, Oct 14, 2016 at 1:11 PM, Heller, Ch

Re: [ceph-users] cephfs slow delete

2016-10-14 Thread Heller, Chris
Ok. Since I’m running through the Hadoop/ceph api, there is no syscall boundary so there is a simple place to improve the throughput here. Good to know, I’ll work on a patch… On 10/14/16, 3:58 PM, "Gregory Farnum" wrote: On Fri, Oct 14, 2016 at 11:41 AM, Heller, Ch

Re: [ceph-users] cephfs slow delete

2016-10-14 Thread Heller, Chris
? -Chris On 10/13/16, 4:22 PM, "Gregory Farnum" wrote: On Thu, Oct 13, 2016 at 12:44 PM, Heller, Chris wrote: > I have a directory I’ve been trying to remove from cephfs (via > cephfs-hadoop), the directory is a few hundred gigabytes in size and > contains a few

Re: [ceph-users] Stuck at "Setting up ceph-osd (10.2.3-1~bpo80+1)"

2016-10-13 Thread Chris Murray
On 13/10/2016 11:49, Henrik Korkuc wrote: Is apt/dpkg doing something now? Is problem repeatable, e.g. by killing upgrade and starting again. Are there any stuck systemctl processes? I had no problems upgrading 10.2.x clusters to 10.2.3 On 16-10-13 13:41, Chris Murray wrote: On 22/09/2016

[ceph-users] cephfs slow delete

2016-10-13 Thread Heller, Chris
better debug this scenario? -Chris

Re: [ceph-users] Stuck at "Setting up ceph-osd (10.2.3-1~bpo80+1)"

2016-10-13 Thread Chris Murray
On 22/09/2016 15:29, Chris Murray wrote: Hi all, Might anyone be able to help me troubleshoot an "apt-get dist-upgrade" which is stuck at "Setting up ceph-osd (10.2.3-1~bpo80+1)"? I'm upgrading from 10.2.2. The two OSDs on this node are up, and think they are versio

Re: [ceph-users] New OSD Nodes, pgs haven't changed state

2016-10-11 Thread Chris Taylor
I see on this list often that peering issues are related to networking and MTU sizes. Perhaps the HP 5400's or the managed switches did not have jumbo frames enabled? Hope that helps you determine the issue in case you want to move the nodes back to the other location. Chris On 2016-
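
A quick end-to-end jumbo-frame test between two nodes (hostname is a placeholder; 8972 is a 9000-byte MTU minus 28 bytes of IP/ICMP headers):

    ping -M do -s 8972 osd-node2

If that fails while a plain ping works, something in the path is dropping large frames.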

[ceph-users] Stuck at "Setting up ceph-osd (10.2.3-1~bpo80+1)"

2016-09-22 Thread Chris Murray
r to be finishing ... ? Thank you in advance, Chris

Re: [ceph-users] Faulting MDS clients, HEALTH_OK

2016-09-21 Thread Heller, Chris
uot;: false, "inst": "client.585194220 192.168.1.157:0\/634334964", "client_metadata": { "ceph_sha1": "d56bdf93ced6b80b07397d57e3fa68fe68304432", "ceph_version": "ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3f

Re: [ceph-users] Faulting MDS clients, HEALTH_OK

2016-09-21 Thread Heller, Chris
I also went and bumped mds_cache_size up to 1 million… still seeing cache pressure, but I might just need to evict those clients… On 9/21/16, 9:24 PM, "Heller, Chris" wrote: What is the interesting value in ‘session ls’? Is it ‘num_leases’ or ‘num_caps’ leases appears to be, on
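
On Hammer that bump can also be made at runtime through the admin socket; a sketch, with the daemon name as a placeholder:

    ceph daemon mds.a config set mds_cache_size 1000000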

Re: [ceph-users] Faulting MDS clients, HEALTH_OK

2016-09-21 Thread Heller, Chris
What is the interesting value in ‘session ls’? Is it ‘num_leases’ or ‘num_caps’ leases appears to be, on average, 1. But caps seems to be 16385 for many many clients! -Chris On 9/21/16, 9:22 PM, "Gregory Farnum" wrote: On Wed, Sep 21, 2016 at 6:13 PM, Heller, Chris wrote:

Re: [ceph-users] Faulting MDS clients, HEALTH_OK

2016-09-21 Thread Heller, Chris
the ceph tools. I’ll try upping “mds cache size”, are there any other configuration settings I might adjust to perhaps ease the problem while I track it down in the HDFS tools layer? -Chris On 9/21/16, 4:34 PM, "Gregory Farnum" wrote: On Wed, Sep 21, 2016 at 1:16 PM, Heller, Ch

Re: [ceph-users] Faulting MDS clients, HEALTH_OK

2016-09-21 Thread Heller, Chris
ce3b626700 3 mds.0.server handle_client_session client_session(request_renewcaps seq 364) v1 from client.491885178 2016-09-21 20:15:58.159134 7fce3b626700 3 mds.0.server handle_client_session client_session(request_renewcaps seq 364) v1 from client.491885188 -Chris On 9/21/16, 11:23 AM, "Heller, Chris"

[ceph-users] Ceph Rust Librados

2016-09-21 Thread Chris Jones
ccess from Rust: (Supports V2 and V4 signatures) Crate: aws-sdk-rust - https://github.com/lambdastackio/aws-sdk-rust Thanks, Chris Jones

Re: [ceph-users] Faulting MDS clients, HEALTH_OK

2016-09-21 Thread Heller, Chris
9baa8780 -1 mds.-1.0 log_to_monitors {default=true} 2016-09-21 15:13:27.329181 7f68969e9700 1 mds.-1.0 handle_mds_map standby 2016-09-21 15:13:28.484148 7f68969e9700 1 mds.-1.0 handle_mds_map standby 2016-09-21 15:13:33.280376 7f68969e9700 1 mds.-1.0 handle_mds_map standby On 9/21/16, 10:

Re: [ceph-users] Faulting MDS clients, HEALTH_OK

2016-09-21 Thread Heller, Chris
the summary). -Chris On 9/21/16, 10:46 AM, "Gregory Farnum" wrote: On Wed, Sep 21, 2016 at 6:30 AM, Heller, Chris wrote: > I’m running a production 0.94.7 Ceph cluster, and have been seeing a > periodic issue arise where in all my MDS clients will become stuck, an

[ceph-users] Faulting MDS clients, HEALTH_OK

2016-09-21 Thread Heller, Chris
point: e51bed37327a676e9974d740a13e173f11d1a11fdba5fbcf963b62023b06d7e8 mdscachedump.txt.gz (https://filetea.me/t1sz3XPHxEVThOk8tvVTK5Bsg) -Chris

Re: [ceph-users] How to associate a cephfs client id to its process

2016-09-14 Thread Heller, Chris
Ok. I’ll see about tracking down the logs (set to stderr for these tasks), and the metadata stuff looks interesting for future association. Thanks, Chris On 9/14/16, 5:04 PM, "Gregory Farnum" wrote: On Wed, Sep 14, 2016 at 7:02 AM, Heller, Chris wrote: > I am making

[ceph-users] How to associate a cephfs client id to its process

2016-09-14 Thread Heller, Chris
can be running on the same host, it's not obvious how to associate the client ‘id’ as reported by ‘session ls’ with any one process on the given host. Are there steps I can follow to backtrack the client ‘id’ to a process id? -Chris

Re: [ceph-users] pools per hypervisor?

2016-09-11 Thread Chris Taylor
each hypervisor but that would be up to you. Chris > On Sep 11, 2016, at 9:04 PM, Thomas wrote: > > Hi Guys, > > Hoping to find help here as I can't seem to find anything on the net. > > I have a ceph cluster and I'd want to use rbd as block storage on our
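
A sketch of one pool plus one key per hypervisor, with the key restricted to its pool (names and PG count are placeholders; pre-Luminous syntax):

    ceph osd pool create rbd-hv1 128
    ceph auth get-or-create client.hv1 \
        mon 'allow r' \
        osd 'allow rwx pool=rbd-hv1' \
        -o /etc/ceph/ceph.client.hv1.keyring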

[ceph-users] Ceph auth key generation algorithm documentation

2016-08-23 Thread Heller, Chris
I’d like to generate keys for ceph external to any system which would have ceph-authtool. Looking over the ceph website and googling have turned up nothing. Is the ceph auth key generation algorithm documented anywhere? -Chris
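
If memory serves, the secret itself is just 16 random bytes; the printable key is base64 of a small little-endian header (type 1 = AES, creation time in seconds and nanoseconds, payload length) followed by those bytes. A sketch under that assumption (verify against ceph-authtool output before relying on it):

    python -c 'import os, struct, time, base64; key = os.urandom(16); hdr = struct.pack("<hiih", 1, int(time.time()), 0, len(key)); print(base64.b64encode(hdr + key).decode())'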

Re: [ceph-users] Signature V2

2016-08-18 Thread Chris Jones

Re: [ceph-users] PG is in 'stuck unclean' state, but all acting OSD are up

2016-08-16 Thread Heller, Chris
marked as ‘found’ once it returns to the network? -Chris From: Goncalo Borges Date: Monday, August 15, 2016 at 11:36 PM To: "Heller, Chris" , "ceph-users@lists.ceph.com" Subject: Re: [ceph-users] PG is in 'stuck unclean' state, but all acting OSD are up Hi Chris..

Re: [ceph-users] PG is in 'stuck unclean' state, but all acting OSD are up

2016-08-15 Thread Heller, Chris
Output of `ceph pg dump_stuck`:

# ceph pg dump_stuck
ok
pg_stat  state         up          up_primary  acting      acting_primary
4.2a8    down+peering  [79,8,74]   79          [79,8,74]   79
4.c3     down+peering  [56,79,67]  56          [56,79,67]  56

-Chris From: Goncalo Borges Date: Monday

[ceph-users] PG is in 'stuck unclean' state, but all acting OSD are up

2016-08-15 Thread Heller, Chris
unstick it, given that all the acting OSD are up and in? (* Re-sent, now that I’m subscribed to list *) -Chris

Re: [ceph-users] RGW pools type

2016-06-12 Thread Chris Jones

Re: [ceph-users] Encryption for data at rest support

2016-06-02 Thread chris holcombe
When combined with the ceph-mon charm you're up and running fast :) -Chris On 06/02/2016 03:57 AM, M Ranga Swami Reddy wrote: > Hello, > > Can you please share if the ceph supports the "data at rest" functionality? > If yes, how can I achieve this? Please share any doc

[ceph-users] Ceph API Announcement

2016-06-01 Thread chris holcombe
having a problem with the formatting of an mds command. I'm going to submit a PR to master to fix it. -Chris

Re: [ceph-users] Increasing pg_num

2016-05-16 Thread Chris Dunlop
Hi Christian, On Tue, May 17, 2016 at 10:41:52AM +0900, Christian Balzer wrote: > On Tue, 17 May 2016 10:47:15 +1000 Chris Dunlop wrote: > Most your questions would be easily answered if you did spend a few > minutes with even the crappiest test cluster and observing things (with >

Re: [ceph-users] Increasing pg_num

2016-05-16 Thread Chris Dunlop
On Mon, May 16, 2016 at 10:40:47PM +0200, Wido den Hollander wrote: > > Op 16 mei 2016 om 7:56 schreef Chris Dunlop : > > Why do we have both pg_num and pgp_num? Given the docs say "The pgp_num > > should be equal to the pg_num": under what circumstances might you wan

Re: [ceph-users] Increasing pg_num

2016-05-16 Thread Chris Dunlop
On Tue, May 17, 2016 at 08:21:48AM +0900, Christian Balzer wrote: > On Mon, 16 May 2016 22:40:47 +0200 (CEST) Wido den Hollander wrote: > > > > pg_num is the actual amount of PGs. This you can increase without any > > actual data moving. > > Yes and no. > > Increasing the pg_num will split PGs, w

Re: [ceph-users] v0.94.7 Hammer released

2016-05-15 Thread Chris Dunlop
15 15:32: ceph-common_0.94.5-1~bpo70+1_amd64.deb 11-May-2016 15:57 9868188 Cheers, Chris

[ceph-users] Increasing pg_num

2016-05-15 Thread Chris Dunlop
y a small increment, increase pgp_num to match, repeat until target reached", or is that no advantage to increasing pg_num (in multiple small increments or single large step) to the target, then increasing pgp_num in small increments to the target - and why? Given that increasing pg_num/pgp_nu
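
Whichever cadence you choose, each step is the same pair of commands (pool name and target are placeholders):

    ceph osd pool set data pg_num 1100
    ceph osd pool set data pgp_num 1100

with a wait for HEALTH_OK between steps.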

Re: [ceph-users] Maximum MON Network Throughput Requirements

2016-05-02 Thread Chris Jones

Re: [ceph-users] adding cache tier in productive hammer environment

2016-04-07 Thread Chris Taylor
osd-snap-trim-sleep 0.1' Recovery may take a little longer while backfilling, but the cluster is still responsive and we have happy VMs now. I've collected these from various posts from the ceph-users list. Maybe they will help you if you haven't tried them already. Chris On

[ceph-users] OSD mounts without BTRFS compression

2016-03-26 Thread Chris Murray
btrfs (rw,noatime,nodiratime,compress-force=lzo,space_cache,subvolid=5,subvol= /) Where should I look next? I'm on 0.94.6 Thanks in advance, Chris

Re: [ceph-users] v0.94.6 Hammer released

2016-03-22 Thread Chris Dunlop
On Wed, Mar 23, 2016 at 01:22:45AM +0100, Loic Dachary wrote: > On 23/03/2016 01:12, Chris Dunlop wrote: >> On Wed, Mar 23, 2016 at 01:03:06AM +0100, Loic Dachary wrote: >>> On 23/03/2016 00:39, Chris Dunlop wrote: >>>> "The old OS'es" that were

Re: [ceph-users] v0.94.6 Hammer released

2016-03-22 Thread Chris Dunlop
Hi Loïc, On Wed, Mar 23, 2016 at 01:03:06AM +0100, Loic Dachary wrote: > On 23/03/2016 00:39, Chris Dunlop wrote: >> "The old OS'es" that were being supported up to v0.94.5 includes debian >> wheezy. It would be quite surprising and unexpected to drop support f

Re: [ceph-users] v0.94.6 Hammer released

2016-03-22 Thread Chris Dunlop
Hi Loïc, On Wed, Mar 23, 2016 at 12:14:27AM +0100, Loic Dachary wrote: > On 22/03/2016 23:49, Chris Dunlop wrote: >> Hi Stable Release Team for v0.94, >> >> Let's try again... Any news on a release of v0.94.6 for debian wheezy >> (bpo70)? > > I don'

Re: [ceph-users] v0.94.6 Hammer released

2016-03-22 Thread Chris Dunlop
Hi Stable Release Team for v0.94, Let's try again... Any news on a release of v0.94.6 for debian wheezy (bpo70)? Cheers, Chris On Thu, Mar 17, 2016 at 12:43:15PM +1100, Chris Dunlop wrote: > Hi Chen, > > On Thu, Mar 17, 2016 at 12:40:28AM +, Chen, Xiaoxi wrote: >> It

Re: [ceph-users] v0.94.6 Hammer released

2016-03-19 Thread Chris Dunlop
Hi Stable Release Team for v0.94, On Thu, Mar 10, 2016 at 11:00:06AM +1100, Chris Dunlop wrote: > On Wed, Mar 02, 2016 at 06:32:18PM +0700, Loic Dachary wrote: >> I think you misread what Sage wrote : "The intention was to >> continue building stable releases (0.94.x

Re: [ceph-users] v0.94.6 Hammer released

2016-03-19 Thread Chris Dunlop
Hi Chen, On Thu, Mar 17, 2016 at 12:40:28AM +, Chen, Xiaoxi wrote: > It’s already there, in > http://download.ceph.com/debian-hammer/pool/main/c/ceph/. I can only see ceph*_0.94.6-1~bpo80+1_amd64.deb there. Debian wheezy would be bpo70. Cheers, Chris > On 3/17/16, 7:20 AM,

Re: [ceph-users] v0.94.6 Hammer released

2016-03-09 Thread Chris Dunlop
he old OS'es are still supported. Their absence is a > glitch in the release process that will be fixed. Any news on a release of v0.94.6 for debian wheezy? Cheers, Chris

[ceph-users] Restrict cephx commands

2016-03-01 Thread chris holcombe
Hey Ceph Users! I'm wondering if it's possible to restrict the ceph keyring to only being able to run certain commands. I think the answer to this is no but I just wanted to ask. I haven't seen any documentation indicating whether or not this is possible. Anyone know?
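
To a degree, yes: mon caps can grant individual commands, and osd caps can be narrowed per pool. A sketch of the syntax (client name, commands and pool are placeholders; check what your release's MonCap parser accepts):

    ceph auth caps client.limited \
        mon 'allow command "pg dump", allow command "osd tree"' \
        osd 'allow r pool=rbd'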

Re: [ceph-users] v0.94.6 Hammer released

2016-03-01 Thread Chris Dunlop
Hi, The "old list of supported platforms" includes debian wheezy. Will v0.94.6 be built for this? Chris On Mon, Feb 29, 2016 at 10:57:53AM -0500, Sage Weil wrote: > The intention was to continue building stable releases (0.94.x) on the old > list of supported platforms (which i

[ceph-users] Another corruption detection/correction question - exposure between 'event' and 'repair'?

2015-12-23 Thread Chris Murray
*after* a repair? I'm obviously hoping for an eventual scenario where this is all transparent to the ZFS layer and it stops detecting checksum errors :) Thanks, Chris

Re: [ceph-users] pg stuck in peering state

2015-12-18 Thread Chris Dunlop
nd osds 2 and 3 (including the MTU). Cheers, Chris On Fri, Dec 18, 2015 at 02:50:18PM +0100, Reno Rainz wrote: > Hi all, > > I reboot all my osd node after, I got some pg stuck in peering state. > > root@ceph-osd-3:/var/log/ceph# ceph -s > cluster 186717a6-bf80-420

Re: [ceph-users] Cephfs: large files hang

2015-12-17 Thread Chris Dunlop
Hi Bryan, Have you checked your MTUs? I was recently bitten by large packets not getting through where small packets would. (This list, Dec 14, "All pgs stuck peering".) Small files working but big files not working smells like it could be a similar problem. Cheers, Chris On Thu, De

Re: [ceph-users] Deploying a Ceph storage cluster using Warewulf on Centos-7

2015-12-17 Thread Chris Jones
Hi Chu, If you can use Chef then: https://github.com/ceph/ceph-chef An example of an actual project can be found at: https://github.com/bloomberg/chef-bcs Chris On Wed, Sep 23, 2015 at 4:11 PM, Chu Ruilin wrote: > Hi, all > > I don't know which automation tool is best for depl

Re: [ceph-users] All pgs stuck peering

2015-12-14 Thread Chris Dunlop
agine you wouldn't want it doing a big packet test every heartbeat, perhaps every 10th or some configurable number. Something for the developers to consider? (cc'ed) Chris

Re: [ceph-users] All pgs stuck peering

2015-12-13 Thread Chris Dunlop
gerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 AUGHHH! That was it. Thanks Robert and Varada! Cheers, Chris
