Yesterday I went through manually configuring a ceph cluster with a
rados gateway on centos 6.5, and I have a question about the
documentation. On this page:
https://ceph.com/docs/master/radosgw/config/
It mentions "On CentOS/RHEL distributions, turn off print continue. If
you have it set to tru
> Any help would be greatly appreciated.
>
>
wrote:
> Bryan,
>
> Good explanation. How's performance now that you've spread the load over
> multiple buckets?
>
> Mark
>
> On 09/04/2013 12:39 PM, Bryan Stillwell wrote:
>
>> Bill,
>>
>> I've run into a similar issue with objects averagi
>>> Bryan
>>>
>>>
>>> On Wed, Sep 4, 2013 at 12:14 PM, Mark Nelson
>>> <mark.nel...@inktank.com> wrote:
>>>
>>> Bryan,
>>>
>>> Good explanation. How's performance now that you'v
that things have slowed down a bit.
The average upload rate over those first 20 hours was ~48
objects/second, but now I'm only seeing ~20 objects/second. This is
with 18,836 buckets.
Bryan
On Wed, Sep 4, 2013 at 12:43 PM, Bryan Stillwell
wrote:
> So far I haven't seen much of a c
but they are not the
> easiest tools to setup/use).
>
> Mark
>
> On 09/05/2013 11:59 AM, Bryan Stillwell wrote:
>>
>> Mark,
>>
>> Yesterday I blew away all the objects and restarted my test using
>> multiple buckets, and things are definitely better!
This appears to be more of an XFS issue than a ceph issue, but I've
run into a problem where some of my OSDs failed because the filesystem
was reported as full even though there was 29% free:
[root@den2ceph001 ceph-1]# touch blah
touch: cannot touch `blah': No space left on device
[root@den2ceph001 ceph-1]# xfs_db -c frag -r /dev/sdc1
actual 3481543, ideal 3447443, fragmentation factor 0.98%
Bryan
On Mon, Oct 14, 2013 at 4:35 PM, Michael Lowe wrote:
>
> How fragmented is that file system?
>
> Sent from my iPad
>
> > On Oct 14, 2013, at 5:44 PM, Bryan Stillwell
> > wrote:
> >
>
What I'm wondering is if reducing the block size from 4K to 2K (or 1K)
would help? I'm pretty sure this would require re-running
mkfs.xfs on every OSD to fix if that's the case...
Thanks,
Bryan
On Mon, Oct 14, 2013 at 5:28 PM, Bryan Stillwell
wrote:
>
> The filesystem isn't
[osd]
osd_mount_options_xfs = "rw,noatime,inode64"
osd_mkfs_options_xfs = "-f -b size=2048"
The cluster is currently running the 0.71 release.
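As a quick sanity check (assuming the OSD data directory is mounted at
/var/lib/ceph/osd/ceph-1), xfs_info will confirm whether the 2K block
size actually took effect:
# xfs_info /var/lib/ceph/osd/ceph-1 | grep bsize
The data line should report bsize=2048 if the mkfs options above were used.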
Bryan
On Mon, Oct 21, 2013 at 2:39 PM, Bryan Stillwell
wrote:
> So I'm running into this issue again and after spending a bit
9
>
> ____
> From: ceph-users-boun...@lists.ceph.com [ceph-users-boun...@lists.ceph.com]
> on behalf of Bryan Stillwell [bstillw...@photobucket.com]
> Sent: Wednesday, October 30, 2013 2:18 PM
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-use
free blocks 91693664
average free extent size 44.7352
That gives me a little more confidence in using 2K block sizes now. :)
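For reference, the summary above comes from xfs_db's freesp command;
assuming the OSD device is /dev/sdc1, the invocation looks like:
# xfs_db -r -c "freesp -s" /dev/sdc1
where -r opens the filesystem read-only.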
Bryan
On Thu, Oct 31, 2013 at 11:02 AM, Bryan Stillwell
wrote:
> Shain,
>
> After getting the segfaults when running 'xfs_db -r "-c freesp -s"'
While updating my cluster to use a 2K block size for XFS, I've run
into a couple OSDs failing to start because of corrupted journals:
=== osd.1 ===
-10> 2013-11-12 13:40:35.388177 7f030458a7a0 1
filestore(/var/lib/ceph/osd/ceph-1) mount detected xfs
-9> 2013-11-12 13:40:35.388194 7f030458a
On Tue, Mar 5, 2013 at 12:44 PM, Kevin Decherf wrote:
>
> On Tue, Mar 05, 2013 at 12:27:04PM -0600, Dino Yancey wrote:
> > The only two features I'd deem necessary for our workload would be
> > stable distributed metadata / MDS and a working fsck equivalent.
> > Snapshots would be great once the f
I have two test clusters running Bobtail (0.56.4) and Ubuntu Precise
(12.04.2). The problem I'm having is that I'm not able to get either
of them into a state where I can both mount the filesystem and have
all the PGs in the active+clean state.
It seems that on both clusters I can get them into a
1 PM, John Wilkins wrote:
> Bryan,
>
> It seems you got crickets with this question. Did you get any further? I'd
> like to add it to my upcoming CRUSH troubleshooting section.
>
>
> On Wed, Apr 3, 2013 at 9:27 AM, Bryan Stillwell <
> bstillw...@photobucket.com> wrote:
//ceph.com
>
>
> On Thu, Apr 18, 2013 at 12:51 PM, John Wilkins
> wrote:
> > Bryan,
> >
> > It seems you got crickets with this question. Did you get any further?
> I'd
> > like to add it to my upcoming CRUSH troubleshooting section.
> >
the tunables. In setups where your branching factors aren't very close to
> your replication counts they aren't normally needed, if you want to reshape
> your cluster a little bit.
> -Greg
>
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>
> On
I've run into an issue where after copying a file to my cephfs cluster
the md5sums no longer match. I believe I've tracked it down to some
parts of the file which are missing:
$ obj_name=$(cephfs "title1.mkv" show_location -l 0 | grep object_name
| sed -e "s/.*:\W*\([0-9a-f]*\)\.[0-9a-f]*/\1/")
$
:
> On Tue, Apr 23, 2013 at 11:38 AM, Bryan Stillwell
> wrote:
>> I've run into an issue where after copying a file to my cephfs cluster
>> the md5sums no longer match. I believe I've tracked it down to some
>> parts of the file which are missing:
>>
>
hout the debugfs stuff
> being enabled. :/
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>
> On Tue, Apr 23, 2013 at 3:00 PM, Bryan Stillwell
> wrote:
>> I've tried a few different ones:
>>
>> 1. cp to cephfs mounted filesystem on
e, Apr 23, 2013 at 4:41 PM, Gregory Farnum wrote:
> On Tue, Apr 23, 2013 at 3:37 PM, Bryan Stillwell
> wrote:
>> I'm using the kernel client that's built into precise & quantal.
>>
>> I could give the ceph-fuse client a try and see if it has the same
>>
On Tue, Apr 23, 2013 at 5:24 PM, Sage Weil wrote:
>
> On Tue, 23 Apr 2013, Bryan Stillwell wrote:
> > I'm testing this now, but while going through the logs I saw something
> > that might have something to do with this:
> >
> > Apr 23 16:35:28 a1 kernel: [
On Tue, Apr 23, 2013 at 5:45 PM, Sage Weil wrote:
> On Tue, 23 Apr 2013, Bryan Stillwell wrote:
>> On Tue, Apr 23, 2013 at 5:24 PM, Sage Weil wrote:
>> >
>> > On Tue, 23 Apr 2013, Bryan Stillwell wrote:
>> > > I'm testing this now, but while going
On Tue, Apr 23, 2013 at 5:54 PM, Gregory Farnum wrote:
> On Tue, Apr 23, 2013 at 4:45 PM, Sage Weil wrote:
>> On Tue, 23 Apr 2013, Bryan Stillwell wrote:
>>> On Tue, Apr 23, 2013 at 5:24 PM, Sage Weil wrote:
>>> >
>>> > On Tue, 23 Apr 2013, Bryan Sti
With the release of cuttlefish, I decided to try out ceph-deploy and
ran into some documentation errors along the way:
http://ceph.com/docs/master/rados/deployment/preflight-checklist/
Under 'CREATE A USER' it has the following line:
To provide full privileges to the user, add the following to
I attempted to upgrade my bobtail cluster to cuttlefish tonight and I
believe I'm running into some mon related issues. I did the original
install manually instead of with mkcephfs or ceph-deploy, so I think
that might have to do with this error:
root@a1:~# ceph-mon -d -c /etc/ceph/ceph.conf
2013
On Thu, May 23, 2013 at 9:58 AM, Smart Weblications GmbH - Florian
Wiessner wrote:
> you may need to update your [mon.a] section in your ceph.conf like this:
>
>
> [mon.a]
>    mon data = /var/lib/ceph/mon/ceph-a/
That didn't seem to make a difference; it kept trying to use ceph-admin.
I tri
Shortly after upgrading from bobtail to cuttlefish I tried increasing
the number of monitors in my small test cluster from 1 to 3, but I
believe I messed something up in the process. At first I thought the
conversion to leveldb failed, but after digging into it a bit I
believe this explains it:
#
"a",
"addr": "172.24.88.50:6789\/0"},
{ "rank": 1,
"name": "mon.b",
"addr": "172.24.88.53:6789\/0"}]}}
Any ideas how to get rid of mon.b?
Thanks,
Bryan
On
I have a cluster I originally built on argonaut and have since
upgraded it to bobtail and then cuttlefish. I originally configured
it with one node for both the mds node and mon node, and 4 other nodes
for hosting osd's:
a1: mon.a/mds.a
b1: osd.0, osd.1, osd.2, osd.3, osd.4, osd.20
b2: osd.5, osd
On Tue, Jun 11, 2013 at 3:50 PM, Gregory Farnum wrote:
> You should not run more than one active MDS (less stable than a
> single-MDS configuration, bla bla bla), but you can run multiple
> daemons and let the extras serve as a backup in case of failure. The
> process for moving an MDS is pretty e
I'm in the process of cleaning up a test that an internal customer did on our
production cluster that produced over a billion objects spread across 6000
buckets. So far I've been removing the buckets like this:
printf %s\\n bucket{1..6000} | xargs -I{} -n 1 -P 32 radosgw-admin bucket rm
--buck
Wouldn't doing it that way cause problems since references to the objects
wouldn't be getting removed from .rgw.buckets.index?
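For anyone stuck with the normal gc path, a rough way to gauge the backlog
(assuming a radosgw-admin new enough to support --include-all) is:
# radosgw-admin gc list --include-all | grep -c '"oid"'
which approximates the number of objects still waiting for collection, and
'radosgw-admin gc process' starts a collection pass immediately instead of
waiting for the next scheduled run.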
Bryan
From: Roger Brown
Date: Monday, July 24, 2017 at 2:43 PM
To: Bryan Stillwell , "ceph-users@lists.ceph.com"
Subject: Re: [ceph-users]
nable amount of time.
Thanks,
Bryan
From: Pavan Rallabhandi
Date: Tuesday, July 25, 2017 at 3:00 AM
To: Bryan Stillwell , "ceph-users@lists.ceph.com"
Subject: Re: [ceph-users] Speeding up garbage collection in RGW
If your Ceph version is >=Jewel, you can try the `--bypass-gc` opti
Excellent, thank you! It does exist in 0.94.10! :)
Bryan
From: Pavan Rallabhandi
Date: Tuesday, July 25, 2017 at 11:21 AM
To: Bryan Stillwell , "ceph-users@lists.ceph.com"
Subject: Re: [ceph-users] Speeding up garbage collection in RGW
I’ve just realized that the option is
Dan,
We recently went through an expansion of an RGW cluster and found that we
needed 'norebalance' set whenever making CRUSH weight changes to avoid slow
requests. We were also increasing the CRUSH weight by 1.0 each time which
seemed to reduce the extra data movement we were seeing with smal
I was reading this post by Josh Durgin today and was pretty happy to see we can
get a summary of features that clients are using with the 'ceph features'
command:
http://ceph.com/community/new-luminous-upgrade-complete/
However, I haven't found an option to display the IP address of those clients.
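The summary itself comes from 'ceph features'; for per-client details one
option is the mon admin socket (assuming the mon id matches the short
hostname):
# ceph daemon mon.$(hostname -s) sessions
which lists each connected session along with its address and the feature
bits it advertises.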
On 09/07/2017 10:47 AM, Josh Durgin wrote:
> On 09/06/2017 04:36 PM, Bryan Stillwell wrote:
> > I was reading this post by Josh Durgin today and was pretty happy to
> > see we can get a summary of features that clients are using with the
> > 'ceph features' c
For about a week we've been seeing a decent number of buffer overflows
detected across all our RGW nodes in one of our clusters. This started
happening a day after we started weighing in some new OSD nodes, so
we're thinking it's probably related to that. Could someone help us
determine the root
On 09/07/2017 01:26 PM, Josh Durgin wrote:
> On 09/07/2017 11:31 AM, Bryan Stillwell wrote:
>> On 09/07/2017 10:47 AM, Josh Durgin wrote:
>>> On 09/06/2017 04:36 PM, Bryan Stillwell wrote:
>>>> I was reading this post by Josh Durgin today and was pretty happy to
>
lf of Bryan
Stillwell
Date: Friday, September 8, 2017 at 9:26 AM
To: ceph-users
Subject: [ceph-users] radosgw crashing after buffer overflows detected
For
e --bypass-gc option to avoid the cleanup, but is there a
way to speed up the gc once you're in this position? There were about 8M
objects that were deleted from this bucket. I've come across a few references
to the rgw-gc settings in the config, but nothing that explained the times w
Bryan
From: Yehuda Sadeh-Weinraub
Date: Wednesday, October 25, 2017 at 11:32 AM
To: Bryan Stillwell
Cc: David Turner , Ben Hines ,
"ceph-users@lists.ceph.com"
Subject: Re: [ceph-users] Speeding up garbage collection in RGW
Some of the options there won't do much for you as they
On Wed, Oct 25, 2017 at 4:02 PM, Yehuda Sadeh-Weinraub
wrote:
>
> On Wed, Oct 25, 2017 at 2:32 PM, Bryan Stillwell
> wrote:
> > That helps a little bit, but overall the process would take years at this
> > rate:
> >
> > # for i in {1..3600}; do ceph df -f jso
As mentioned in another thread I'm trying to remove several thousand buckets on
a hammer cluster (0.94.10), but I'm running into a problem using --bypass-gc.
I usually see either this error:
# radosgw-admin bucket rm --bucket=sg2pl598 --purge-objects --bypass-gc
2017-10-31 09:21:04.111599 7f45f5
We're looking into switching the failure domains on several of our
clusters from host-level to rack-level and I'm trying to figure out the
least impactful way to accomplish this.
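The mechanical part is simple enough; a rough sketch on Luminous or later,
with the rule and pool names below as placeholders:
# ceph osd crush rule create-replicated rack-rule default rack
# ceph osd set norebalance
# ceph osd pool set volumes crush_rule rack-rule
# ceph osd unset norebalance
The harder part is pacing the data movement that follows, which is what I'm
trying to minimize.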
First off, I've made this change before on a couple large (500+ OSDs)
OpenStack clusters where the volumes, images, and
Bryan,
Based off the information you've provided so far, I would say that your largest
pool still doesn't have enough PGs.
If you originally had only 512 PGs for your largest pool (I'm guessing
.rgw.buckets has 99% of your data), then on a balanced cluster you would have
just ~11.5 PGs per OSD
It may work fine, but I would suggest limiting the number of operations going
on at the same time.
Bryan
From: Bryan Banister
Date: Tuesday, February 13, 2018 at 1:16 PM
To: Bryan Stillwell , Janne Johansson
Cc: Ceph Users
Subject: RE: [ceph-users] Help rebalancing OSD usage, Luminus 1.2.2
I decided to upgrade my home cluster from Luminous (v12.2.7) to Mimic (v13.2.1)
today and ran into a couple issues:
1. When restarting the OSDs during the upgrade it seems to forget my upmap
settings. I had to manually return them to the way they were with commands
like:
ceph osd pg-upmap-items
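For reference, the full syntax takes a PG id followed by from/to OSD pairs,
e.g. (placeholder ids):
ceph osd pg-upmap-items 1.7 3 5
which remaps PG 1.7 from osd.3 to osd.5.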
> On 08/30/2018 11:00 AM, Joao Eduardo Luis wrote:
> > On 08/30/2018 09:28 AM, Dan van der Ster wrote:
> > Hi,
> > Is anyone else seeing rocksdb mon stores slowly growing to >15GB,
> > eventually triggering the 'mon is using a lot of disk space' warning?
> > Since upgrading to luminous, we've seen
After we upgraded from Jewel (10.2.10) to Luminous (12.2.5) we started seeing a
problem where the new ceph-mgr would sometimes hang indefinitely when doing
commands like 'ceph pg dump' on our largest cluster (~1,300 OSDs). The rest of
our clusters (10+) aren't seeing the same issue, but they ar
I left some of the 'ceph pg dump' commands running and twice they returned
results after 30 minutes, and three times it took 45 minutes. Is there
something that runs every 15 minutes that would let these commands finish?
Bryan
From: Bryan Stillwell
Date: Thursday, October 18, 201
t. Anyone know the reasoning
for that decision?
Bryan
From: Dan van der Ster
Date: Thursday, October 18, 2018 at 2:03 PM
To: Bryan Stillwell
Cc: ceph-users
Subject: Re: [ceph-users] ceph-mgr hangs on larger clusters in Luminous
15 minutes seems like the ms tcp read timeout would be rel
collectd which is running
'ceph pg dump' every 16-17 seconds. I guess you could say we're stress testing
that code path fairly well... :)
Bryan
On Thu, Oct 18, 2018 at 6:17 PM Bryan Stillwell
<bstillw...@godaddy.com> wrote:
After we upgraded from Jewel (10.
[mailto:drakonst...@gmail.com]
Sent: Friday, February 16, 2018 3:21 PM
To: Bryan Banister <bbanis...@jumptrading.com>
Cc: Bryan Stillwell <bstillw...@godaddy.com>;
Janne Johansson <icepic...@gmail.com>; Ceph Users
<ceph-users@lists.ceph.com>
We recently began our upgrade testing for going from Jewel (10.2.10) to
Luminous (12.2.5) on our clusters. The first part of the upgrade went
pretty smoothly (upgrading the mon nodes, adding the mgr nodes, upgrading
the OSD nodes), however, when we got to the RGWs we started seeing internal
server
> We have a large 1PB ceph cluster. We recently added 6 nodes with 16 2TB disks
> each to the cluster. The first 5 nodes rebalanced well without any issues, but
> the OSDs on the sixth/last node started acting weird: as I increase the weight of one
> osd, the utilization doesn't change but a different osd on the s
This has come up quite a few times before, but since I was only working with
RBD before I didn't pay too close attention to the conversation. I'm looking
for the best way to handle existing clusters that have buckets with a large
number of objects (>20 million) in them. The cluster I'm doing test
Is this on an RGW cluster?
If so, you might be running into the same problem I was seeing with large
bucket sizes:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-June/018504.html
The solution is to shard your buckets so the bucket index doesn't get too big.
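On releases that include the reshard subcommand (roughly Jewel 10.2.5 and
later), an existing bucket can be resharded offline; the bucket name and
shard count below are placeholders:
# radosgw-admin bucket reshard --bucket=mybucket --num-shards=128
For new buckets, rgw_override_bucket_index_max_shards in ceph.conf sets the
default number of index shards.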
Bryan
From: ceph-users o
I have a cluster running 10.2.7 that is seeing some extremely large directory
sizes in CephFS according to the recursive stats:
$ ls -lhd Originals/
drwxrwxr-x 1 bryan bryan 16E Jun 13 13:27 Originals/
du reports a much smaller (and accurate) number:
$ du -sh Originals/
300G    Originals/
This
On 6/15/17, 9:20 AM, "John Spray" wrote:
>
> On Wed, Jun 14, 2017 at 4:31 PM, Bryan Stillwell
> wrote:
> > I have a cluster running 10.2.7 that is seeing some extremely large
> > directory sizes in CephFS according to the recursive stats:
> >
> >
Wido,
I've been looking into this large omap objects problem on a couple of our
clusters today and came across your script during my research.
The script has been running for a few hours now and I'm already over 100,000
'orphaned' objects!
It appears that ever since upgrading to Luminous (12.2
Recently on one of our bigger clusters (~1,900 OSDs) running Luminous (12.2.8),
we had a problem where OSDs would frequently get restarted while deep-scrubbing.
After digging into it I found that a number of the OSDs had very large omap
directories (50GiB+). I believe these were OSDs that had p
f Zelenka
Date: Thursday, January 3, 2019 at 3:49 AM
To: "J. Eric Ivancich"
Cc: "ceph-users@lists.ceph.com" , Bryan Stillwell
Subject: Re: [ceph-users] Omap issues - metadata creating too many
Hi, i had the default - so it was on(according to ceph kb). turned it
off, but the iss
I have a cluster with over 1900 OSDs running Luminous (12.2.8) that isn't
cleaning up old osdmaps after doing an expansion. This is even after the
cluster became 100% active+clean:
# find /var/lib/ceph/osd/ceph-1754/current/meta -name 'osdmap*' | wc -l
46181
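Each OSD also reports the range of maps it is holding, which makes it easy
to watch whether trimming ever kicks in (same OSD as above):
# ceph daemon osd.1754 status
The gap between oldest_map and newest_map in that output should shrink once
the mons start trimming again.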
With the osdmaps being over 600KB i
I believe the option you're looking for is mon_data_size_warn. The default is
set to 16106127360.
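To see how close a given mon is to that threshold (default data path and a
mon id of 'a' assumed):
# du -sh /var/lib/ceph/mon/ceph-a
# ceph daemon mon.a config get mon_data_size_warn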
I've found that sometimes the mons need a little help getting started with
trimming if you just completed a large expansion. Earlier today I had a
cluster where the mon's data directory was over
solution Dan came across back in the hammer days. It
works, but not ideal for sure. Across the cluster it freed up around 50TB of
data!
Bryan
From: ceph-users on behalf of Bryan
Stillwell
Date: Monday, January 7, 2019 at 2:40 PM
To: ceph-users
Subject: [ceph-users] osdmaps not being cleaned
we're seeing up to 49,272 osdmaps hanging around. The churn trick seems to be
working again too.
Bryan
From: Dan van der Ster
Date: Thursday, January 10, 2019 at 3:13 AM
To: Bryan Stillwell
Cc: ceph-users
Subject: Re: [ceph-users] osdmaps not being cleaned up in 12.2.8
Hi Bryan,
I think th
I've created the following bug report to address this issue:
http://tracker.ceph.com/issues/37875
Bryan
From: ceph-users on behalf of Bryan
Stillwell
Date: Friday, January 11, 2019 at 8:59 AM
To: Dan van der Ster
Cc: ceph-users
Subject: Re: [ceph-users] osdmaps not being cleaned
I'm looking for some help in fixing a bucket index on a Luminous (12.2.8)
cluster running on FileStore.
First some background on how I believe the bucket index became broken. Last
month we had a PG in our .rgw.buckets.index pool become inconsistent:
2018-12-11 09:12:17.743983 osd.1879 osd.1879 1
This is sort of related to my email yesterday, but has anyone ever rebuilt a
bucket index using the objects themselves?
It seems to be that it would be possible since the bucket_id is contained
within the rados object name:
# rados -p .rgw.buckets.index listomapkeys .dir.default.56630221.139618
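If you need to confirm which bucket_id a bucket is currently using (the
bucket name below is a placeholder), bucket stats will show it:
# radosgw-admin bucket stats --bucket=mybucket | grep '"id"'
That id is what appears in the .dir.<bucket_id>[.<shard>] index object names.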
Since you're using jumbo frames, make sure everything between the nodes
properly supports them (nics & switches). I've tested this in the past by
using the size option in ping (you need to use a payload size of 8972 instead
of 9000 to account for the 28 byte header):
ping -s 8972 192.168.160.
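If you also want the test to fail loudly instead of ever fragmenting, -M do
forbids fragmentation (the destination below is a placeholder):
ping -M do -s 8972 192.168.160.10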
When you use 3+2 EC that means you have 3 data chunks and 2 erasure chunks for
your data. So you can handle two failures, but not three. The min_size
setting is preventing you from going below 3 because that's the number of data
chunks you specified for the pool. I'm sorry to say this, but si
I've run my home cluster with drives ranging in size from 500GB to 8TB before
and the biggest issue you run into is that the bigger drives will get a
proportionally larger number of PGs, which will increase the memory requirements on
them. Typically you want around 100 PGs/OSD, but if you mix 4TB an
I'm wondering if the 'radosgw-admin bucket check --fix' command is broken in
Luminous (12.2.8)?
I'm asking because I'm trying to reproduce a situation we have on one of our
production clusters and it doesn't seem to do anything. Here's the steps of my
test:
1. Create a bucket with 1 million o
We have two separate RGW clusters running Luminous (12.2.8) that have started
seeing an increase in PGs going active+clean+inconsistent with the reason being
caused by an omap_digest mismatch. Both clusters are using FileStore and the
inconsistent PGs are happening on the .rgw.buckets.index poo
> On Apr 8, 2019, at 4:38 PM, Gregory Farnum wrote:
>
> On Mon, Apr 8, 2019 at 3:19 PM Bryan Stillwell wrote:
>>
>> There doesn't appear to be any correlation between the OSDs which would
>> point to a hardware issue, and since it's happening on two di
> On Apr 8, 2019, at 5:42 PM, Bryan Stillwell wrote:
>
>
>> On Apr 8, 2019, at 4:38 PM, Gregory Farnum wrote:
>>
>> On Mon, Apr 8, 2019 at 3:19 PM Bryan Stillwell
>> wrote:
>>>
>>> There doesn't appear to be any correlation between
On Oct 29, 2019, at 11:23 AM, Jean-Philippe Méthot
wrote:
> A few months back, we had one of our OSD node motherboards die. At the time,
> we simply waited for recovery and purged the OSDs that were on the dead node.
> We just replaced that node and added back the drives as new OSDs. At the cep
Jelle,
Try putting just the WAL on the Optane NVMe. I'm guessing your DB is too big
to fit within 5GB. We used a 5GB journal on our nodes as well, but when we
switched to BlueStore (using ceph-volume lvm batch) it created 37GiB logical
volumes (200GB SSD / 5 or 400GB SSD / 10) for our DBs.
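If you're unsure how much DB space an OSD is actually using, the BlueFS
counters give a quick read (OSD id assumed):
# ceph daemon osd.0 perf dump bluefs | grep -E 'db_total_bytes|db_used_bytes|slow_used_bytes'
A non-zero slow_used_bytes means the DB has already spilled over onto the
main device.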
A