The full ratio is based on the max bytes. If you say that the cache should
have a max bytes of 1TB and that the full ratio is .8, then it will aim to
keep it at 800GB. Without a max bytes value set, the ratios are a
percentage of unlimited, i.e., effectively no limit themselves. The
full_ratio should be res
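For reference, a minimal sketch of setting both values on a cache pool
(the pool name hot-cache is just an example):

  ceph osd pool set hot-cache target_max_bytes 1099511627776   # 1TB cap
  ceph osd pool set hot-cache cache_target_full_ratio 0.8      # aim for ~800GB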
> of the current state of that work, but it would benefit all LIO targets
> when complete.
Zhu Lingshan (cc'ed) worked on a prototype for tcmu PR support. IIUC,
whether DLM or the underlying Ceph cluster gets used for PR state
storage is still under consideration.
Cheers, David
ite_aptpl_to_file() seems to be the only function that
uses the path. Otherwise I would have thought the same, that
propagating the file to backup gateways prior to failover would be
sufficient.
Cheers, David
Here is your friend.
http://docs.ceph.com/docs/luminous/rados/operations/erasure-code/#erasure-coding-with-overwrites
On Thu, Oct 12, 2017 at 2:09 PM Jason Dillaman wrote:
> The image metadata still needs to live in a replicated data pool --
> only the data blocks can be stored in an EC pool. Th
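Roughly, per that doc page, you enable overwrites on the EC pool and point
the image data at it while the metadata stays replicated (pool/image names
are placeholders):

  ceph osd pool set my-ec-pool allow_ec_overwrites true
  rbd create my-replicated-pool/my-image --size 100G --data-pool my-ec-pool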
John covered everything better than I was going to, so I'll just remove
that from my reply.
If you aren't using DC SSDs and this is prod, then I wouldn't recommend
moving towards this model. However you are correct on how to move the pool
to the SSDs from the HDDs and based on how simple and quic
I don't have access to a luminous cluster at the moment, but I would try
looking in the pg dump first. You could also try the crush map.
Worst case scenario you could set up a bunch of test clients and attempt to
connect them to your cluster. You should be able to find which is the
oldest version
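If the cluster really is luminous, `ceph features` should answer this
directly; per-session detail is also available from the mons (mon name is a
placeholder):

  ceph features                  # summary of connected client releases/features
  ceph daemon mon.a sessions     # per-client detail, including features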
I improved the code to compute degraded objects during
backfill/recovery. During my testing it wouldn't result in a percentage
above 100%. I'll have to look at the code and verify that some
subsequent changes didn't break things.
David
On 10/13/17 9:55 AM, Florian Haas w
What does your environment look like? Someone recently on the mailing list
had PGs stuck creating because of a networking issue.
On Fri, Oct 13, 2017 at 2:03 PM Ronny Aasen
wrote:
> strange that no osd is acting for your pg's
> can you show the output from
> ceph osd tree
>
>
> mvh
> Ronny Aase
Thanks all for input on this.
It’s taken a couple of weeks, but based on the feedback from the list,
we’ve got our version of a scrub-one-at-a-time cron script running and
confirmed that it’s working properly.
Unfortunately, this hasn’t really solved the real problem. Even with
just one scrub an
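For anyone interested, the heart of a scrub-one-at-a-time script can be as
small as this (a sketch only, assuming jq is installed; the JSON layout of
pg dump varies a bit between releases):

  # deep-scrub the PG with the oldest deep-scrub stamp, one per cron run
  pg=$(ceph pg dump --format json 2>/dev/null |
       jq -r '.pg_stats | sort_by(.last_deep_scrub_stamp) | .[0].pgid')
  ceph pg deep-scrub "$pg"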
What is the output of your `ceph status`?
On Fri, Oct 13, 2017, 10:09 PM dE wrote:
> On 10/14/2017 12:53 AM, David Turner wrote:
>
> What does your environment look like? Someone recently on the mailing
> list had PGs stuck creating because of a networking issue.
>
> On Fri,
ts. This would not be the
first time I've seen a bonded network cause issues at least this bad on a
cluster. Do you have cluster_network and public_network set? What does your
network topology look like?
On Fri, Oct 13, 2017, 11:02 PM J David wrote:
> Thanks all for input on this.
>
>
On Sat, Oct 14, 2017 at 9:33 AM, David Turner wrote:
> First, there is no need to deep scrub your PGs every 2 days.
They aren’t being deep scrubbed every two days, nor is there any
attempt (or desire) to do so. That would require 8+ scrubs running
at once. Currently, it takes between 2
ely starving
other normal or lower priority requests. Is that how it works? Or is
the queue in question a simple FIFO queue?
Is there anything else I can try to help narrow this down?
Thanks!
On Sat, Oct 14, 2017 at 6:51 PM, J David wrote:
> On Sat, Oct 14, 2017 at 9:33 AM, David Turner
I don't see same_interval_since being cleared by split.
PG::split_into() copies the history from the parent PG to child. The
only code in Luminous that I see that clears it is in
ceph_objectstore_tool.cc
David
On 10/16/17 3:59 PM, Gregory Farnum wrote:
On Mon, Oct 16, 2017 at
I'm speaking to the method in general and don't know the specifics of
bluestore. Recovering from a failed journal in this way is only a good
idea if you were able to flush the journal before making a new one. If the
journal failed during operation and you couldn't cleanly flush the journal,
then
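For filestore, the clean sequence is something like this (a sketch,
assuming osd.2):

  systemctl stop ceph-osd@2
  ceph-osd -i 2 --flush-journal    # only safe if this completes cleanly
  # swap in the new journal device/partition, then:
  ceph-osd -i 2 --mkjournal
  systemctl start ceph-osd@2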
On Wed, Oct 18, 2017 at 8:12 AM, Ольга Ухина wrote:
> I have a problem with ceph luminous 12.2.1.
> […]
> I have slow requests on different OSDs at random times (for example at night,
> but I don't see any problems at the time of the problem
> […]
> 2017-10-18 01:20:38.187326 mon.st3 mon.0 10.192.1.78:
Have you ruled out the disk controller and backplane in the server running
slower?
On Thu, Oct 19, 2017 at 4:42 PM Russell Glaue wrote:
> I ran the test on the Ceph pool, and ran atop on all 4 storage servers, as
> suggested.
>
> Out of the 4 servers:
> 3 of them performed with 17% to 30% disk %
the
> disks slower.
> Is there a way I could test that theory, other than swapping out hardware?
> -RG
>
> On Thu, Oct 19, 2017 at 3:44 PM, David Turner
> wrote:
>
>> Have you ruled out the disk controller and backplane in the server
>> running slower?
>>
>>
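One way to compare raw write speed per OSD without swapping hardware is the
built-in OSD bench (by default it writes 1GB in 4MB blocks):

  ceph tell osd.* bench

If the OSDs on one server consistently come back slower, that points at the
controller or backplane.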
In a 3 node cluster with EC k=2 m=1, you can turn off one of the nodes and
the cluster will still operate normally. If you lose a disk during this
state or another server goes offline, then you lose access to your data.
But assuming that you bring up the third node and let it finish
backfilling/re
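For reference, a k=2 m=1 pool with a host failure domain is created roughly
like this on luminous (profile and pool names are placeholders):

  ceph osd erasure-code-profile set k2m1 k=2 m=1 crush-failure-domain=host
  ceph osd pool create my-ec-pool 128 128 erasure k2m1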
t 4:59 PM Jorge Pinilla López
wrote:
> Well I was trying it some days ago and it didn't work for me.
>
> maybe because of this:
>
> http://tracker.ceph.com/issues/18749
>
> https://github.com/ceph/ceph/pull/17619
>
> I don't know if now it's actually wor
How are you uploading a file? RGW, librados, CephFS, or RBD? There are
multiple reasons that the space might not be updating or cleaning itself
up. The more information you can give us about how you're testing, the
more we can help you.
On Thu, Oct 19, 2017 at 5:00 PM nigel davies wrote:
> Ha
at 5:03 PM Jorge Pinilla López
wrote:
> Yes, I am trying it over luminous.
>
> Well the bug has been going for 8 month and it hasn't been merged yet. Idk
> if that is whats preventing me to make it work. Tomorrow I will try to
> prove it again.
>
> On 19/10/2017 at
On Thu, Oct 19, 2017 at 9:42 PM, Brad Hubbard wrote:
> I guess you have both read and followed
> http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/?highlight=backfill#debugging-slow-requests
>
> What was the result?
Not sure if you’re asking Ольга or myself, but in my cas
ther 3 servers.
> > >
> > > JC
> > >
> > > On Oct 19, 2017, at 13:49, Russell Glaue wrote:
> > >
> > > No, I have not ruled out the disk controller and backplane making the
> > > disks slower.
> > > Is there a way I could test tha
I don't know that you can disable snapshots. There isn't an automated
method in ceph to run snapshots, but you can easily script it. There are a
lot of different types of snapshots in ceph depending if you're using rbd,
rgw, or CephFS. There are also caveats and config options you should tweak
depe
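As an example of how simple the scripting can be for RBD (a sketch,
assuming a pool rbd and an image myimage):

  rbd snap create rbd/myimage@backup-$(date +%Y%m%d)
  rbd snap ls rbd/myimage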
Unless you manually issue a snapshot command on the pool, you will never
have a snapshot made. But again, I don't think you can disable it.
On Fri, Oct 20, 2017, 6:52 AM nigel davies wrote:
> OK, I have set up an S3 bucket linked to my ceph cluster, so RGW. I only
> created my cluster yesterday.
>
>
If you add the external domain to the zonegroup's hostnames and endpoints,
then it will be able to respond to that domain. This is assuming that the
error message is that the URL is not a valid bucket. We ran into this issue
when we upgraded from 10.2.5 to 10.2.9. Any domain used to access RGW that
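The change is roughly this (a sketch; the domain is a placeholder):

  radosgw-admin zonegroup get > zonegroup.json
  # add the external domain to the "hostnames" list in zonegroup.json
  radosgw-admin zonegroup set < zonegroup.json
  radosgw-admin period update --commit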
We recently deleted a bucket that was no longer needed that had 400TB of
data in it to help as our cluster is getting quite full. That should free
up about 30% of our cluster used space, but in the last week we haven't
seen anywhere near that amount free up yet. I left the cluster with this
runni
purged_snaps persists indefinitely. If the list gets too large it
abbreviates it a bit, but it can cause your osd-map to get a fair bit
larger because it keeps track of them.
On Sun, Oct 22, 2017 at 10:39 PM Eric Eastman
wrote:
> On Sun, Oct 22, 2017 at 8:05 PM, Yan, Zheng wrote:
>
>> On
Multiple cache tiers? 2 tiers to 1 pool or a cache tier to a cache tier?
Neither is discussed or mentioned anywhere. At best it might work, but
isn't tested for a new release.
One cache to multiple pools? Same as above.
The luminous docs for cache tiering were updated with "A Word of Caution"
wh
This can be changed to a failure domain of OSD in which case it could
satisfy the criteria. The problem with a failure domain of OSD is that
all of your data could reside on a single host and you could lose access to
your data after restarting a single host.
On Mon, Oct 23, 2017 at 3:23 PM LOPEZ
on what to
do to optimize the gc settings to hopefully/eventually catch up on this as
well as stay caught up once we are? I'm not expecting an overnight fix,
but something that could feasibly be caught up within 6 months would be
wonderful.
On Mon, Oct 23, 2017 at 11:18 AM David Turner wrote:
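In the meantime you can also watch the backlog and run gc passes by hand:

  radosgw-admin gc list --include-all | head   # peek at the backlog
  radosgw-admin gc process                     # run a gc pass now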
> rgw lifecycle work time = 00:01-23:59
> rgw gc max objs = 2647
> rgw lc max objs = 2647
> rgw gc obj min wait = 300
> rgw gc processor period = 600
> rgw gc processor max time = 600
>
>
> -Ben
>
> On Tue, Oct 24, 2017 at 9:25 AM, David Turner
> wrote:
>
>> As I'
Are you talking about RGW buckets with limited permissions for cephx
authentication? Or RGW buckets with limited permissions for RGW users?
On Wed, Oct 25, 2017 at 12:16 PM nigel davies wrote:
> Hey all
>
> is it possible to set permissions to buckets
>
> for example if i have 2 users (user_a a
I had the exact same error when using --bypass-gc. We too decided to
destroy this realm and start it fresh. For us, 95% of the data in this
realm is backups for other systems and they're fine rebuilding it. So our
plan is to migrate the 5% of the data to a temporary s3 location and then
rebuild
rbd-nbd is gaining a lot of followers for mapping RBDs. The kernel
driver for RBDs has taken a while to support features of current ceph
versions. The nice thing with rbd-nbd is that it has feature parity with
the version of ceph you are using and can enable all of the rbd features
you wa
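Usage is basically a drop-in for the kernel client (a sketch, assuming a
pool rbd and an image myimage):

  rbd-nbd map rbd/myimage      # prints the device, e.g. /dev/nbd0
  rbd-nbd unmap /dev/nbd0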
What does your crush map look like? Also a `ceph df` output. You're
optimizing your map for pool #5, if there are other pools with a
significant amount of data, then you're going to be off on your cluster
balance.
A big question for balancing a cluster is how big are your PGs? If your
primary dat
Your client needs to tell the cluster that the objects have been deleted.
'-o discard' is my go-to because I'm lazy and it works well enough for me.
If you're in need of more performance, then fstrim is the other option.
Nothing on the Ceph side can be configured to know when a client no longer
need
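Concretely, that's one of these (device and mountpoint are placeholders):

  mount -o discard /dev/rbd0 /mnt/rbd    # online discard
  fstrim -v /mnt/rbd                     # or batch trim, e.g. from cron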
If you can do an ssh session to the IPMI console and then do that inside of
a screen, you can save the output of the screen to a file and look at what
was happening on the console when the server locked up. That's how I track
kernel panics.
On Fri, Oct 27, 2017 at 1:53 PM Bogdan SOLGA wrote:
>
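Something like this (a sketch; the BMC host and credentials are
placeholders, and screen's -L flag logs the session to screenlog.0):

  screen -L -S console ipmitool -I lanplus -H bmc.example.com -U admin -P secret sol activate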
Saying Ubuntu doesn't have a place on servers negates your assertion that
the OS is a tool and you should use the right tool for the right job.
Sometimes you need an OS that updates its kernel more often than basically
never. Back when VMs were gaining traction and CentOS 6 was running the 2.6
kern
st of needing to know all of them or to retrain them.
If Ubuntu wasn't stable and secure, it wouldn't be popular. It may not be
the most stable or secure, but it sure does get new features faster.
On Sat, Oct 28, 2017, 1:01 PM David Turner wrote:
> Saying Ubuntu doesn't
What is your min_size in the cache pool? If your min_size is 2, then the
cluster would block requests to that pool due to it having too few copies
available.
PS - Please don't consider using rep_size 2 in production.
On Wed, Nov 1, 2017 at 5:14 AM Eugen Block wrote:
> Hi experts,
>
> we have u
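Checking takes two commands (cache-pool is a placeholder):

  ceph osd pool get cache-pool size
  ceph osd pool get cache-pool min_size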
PPS - or min_size 1 in production
On Wed, Nov 1, 2017 at 10:08 AM David Turner wrote:
> What is your min_size in the cache pool? If your min_size is 2, then the
> cluster would block requests to that pool due to it having too few copies
> available.
>
> PS - Please don
It looks like you're running with a size = 2 and min_size = 1 (the min_size
is a guess, the size is based on how many osds belong to your problem
PGs). Here's some good reading for you.
https://www.spinics.net/lists/ceph-users/msg32895.html
Basically the gist is that when running with size = 2 yo
proxy in front of the object gateway)
Is there some other way to achieve my goal?
Thanks in advance,
--
David Watzke
important thing is that even with min_size=1 writes are
> acknowledged after ceph wrote size=2 copies.
> In the thread above there is:
>
> As David already said, when all OSDs are up and in for a PG Ceph will wait
> for ALL OSDs to Ack the write. Writes in RADOS are always synchrono
Jon,
If you are able please test my tentative fix for this issue which
is in https://github.com/ceph/ceph/pull/18673
Thanks
David
On 10/30/17 1:13 AM, Jon Light wrote:
Hello,
I have three OSDs that are crashing on start with a FAILED
assert(p.same_interval_since) error. I ran
>> -osd 2 comes back (therefore we have a clean osd in the acting group) ...
>> backfill would continue to osd 1 of course
>> -or data in pg "A" is manually marked as lost, and then continues
>> operation from osd 1 's (outdated) copy?
>>
>
> It does
e were replacing a dozen disks
> weekly.
>
> On the flip side shutting down client access because of a disk failure in
> the cluster is *unacceptable* to a product
>
> On Wed, Nov 1, 2017 at 10:08 AM, David Turner
> wrote:
>
>> PPS - or min_size 1 in production
>&g
...
>> backfill would continue to osd 1 of course
>> -or data in pg "A" is manually marked as lost, and then continues
>> operation from osd 1 's (outdated) copy?
>>
>
> It does deny IO in that case. I think David was pointing out that if OSD 2
> is a
Jewel 10.2.7; XFS formatted OSDs; no dmcrypt or LVM. I have a pool that I
deleted 16 hours ago that accounted for about 70% of the available space on
each OSD (averaging 84% full), 370M objects in 8k PGs, ec 4+2 profile.
Based on the rate that the OSDs are freeing up space after deleting the
pool,
The Ceph docs are versioned. The link you used is for jewel. Change the
jewel in the url to luminous to look at the luminous version of the docs.
That said, the documentation regarding RAM recommendations has not changed,
but this topic was covered fairly recently on the ML. Here is a link to
t
Has anyone developed a bot that can be used in Slack to run a few commands
against a ceph cluster? I'm thinking about something that could run some
read-only commands like `ceph status`.
If not, I will be glad to start some work on it. But I figured that I may
not be the only person out there that
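Not a bot, but as a starting point, even a cron-able push of `ceph health`
to a Slack incoming webhook works (a sketch; the webhook URL is a
placeholder, and health output containing quotes would need escaping):

  curl -s -X POST -H 'Content-type: application/json' \
    --data "{\"text\": \"$(ceph health)\"}" \
    https://hooks.slack.com/services/T000/B000/XXXX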
If you don't mind juggling multiple access/secret keys, you can use
subusers. Just have 1 user per bucket and create subusers with read,
write, etc permissions. The objects are all owned by the 1 user that
created the bucket, and then you pass around the subuser keys to the
various apps that need
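Creating one looks roughly like this (user and subuser names are examples):

  radosgw-admin subuser create --uid=bucket_user \
    --subuser=bucket_user:app1 --access=write \
    --key-type=s3 --gen-access-key --gen-secret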
n upgrade to Luminous and take advantage of the newer features for rgw.
On Mon, Nov 6, 2017 at 11:54 AM nigel davies wrote:
> Thanks all
>
> David, if you can explain how to create subusers with keys, I'm happy to try
> and explain it to my boss.
>
> The issue i had with the ACLs, for some
ere
and must be different. I would give bucket_a and user_a the same name for
simplicity so it's obvious which user owns which bucket.
On Tue, Nov 7, 2017, 5:25 AM nigel davies wrote:
> Thanks David and All
>
> I am trying out what you said now.
>
> When talking to my manager about
Jewel 10.2.7. I have a realm that is not replicating data unless I restart
the RGW daemons. It will catch up when I restart the daemon, but then not
replicate new information until it's restarted again. This is the only
realm with this problem, but all of the realms are configured identically.
D
What's the output of `ceph df` to see if your PG counts are good or not?
Like everyone else has said, the space on the original osds can't be
expected to free up until the backfill from adding the new osd has finished.
You don't have anything in your cluster health to indicate that your
cluster wi
lt-in reweighting scripts might help your data distribution.
reweight-by-utilization
On Sun, Nov 12, 2017, 11:41 AM gjprabu wrote:
> Hi David,
>
> Thanks for your valuable reply. Once the backfilling for the new osd
> completes, we will consider increasing the replica value asap. Is it possi
The only raid I would consider using for a ceph osd is raid 0. Ceph deals
with the redundancy very nicely and you won't get the impact of running on
a parity raid. I wouldn't suggest doing all 8 drives in a node in a single
raid 0, but you could reduce your osd count in half by doing 4x 2 drive
r
I'm assuming you've looked at the period in both places `radosgw-admin
period get` and confirmed that the second site is behind the master site
(based on epochs). I'm also assuming (since you linked the instructions)
that you've done `radosgw-admin period pull` on the second site to get any
period
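That is, on each site, something like:

  radosgw-admin sync status
  radosgw-admin period get | grep -E '"id"|epoch'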
to go faster.
>
> Don’t just remove the directory in the filesystem; you need to clean up
> the leveldb metadata as well. ;)
> Removing the pg via Ceph-objectstore-tool would work fine but I’ve seen
> too many people kill the wrong thing to recommend it.
> -Greg
> On Thu, Nov 2, 2
How big was your blocks.db partition for each OSD and what size are your
HDDs? Also how full is your cluster? It's possible that your blocks.db
partition wasn't large enough to hold the entire db and it had to spill
over onto the HDD which would definitely impact performance.
On Tue, Nov 14, 201
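One way to check whether the db has spilled onto the HDD is the bluefs perf
counters (a sketch, assuming osd.0 and jq on the OSD host; counter names can
vary by release):

  ceph daemon osd.0 perf dump |
    jq '.bluefs | {db_total_bytes, db_used_bytes, slow_used_bytes}'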
If you know that the pool should be empty, there wouldn't be a problem with
piping the output of `rados ls` to `rados rm`. By the same notion, if
nothing in the pool is needed you can delete the pool and create a new one
that will be perfectly empty.
On Tue, Nov 14, 2017 at 3:23 PM Karun Josy wro
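The pipe version is just this (pool name is a placeholder; only do this if
nothing in the pool is needed):

  rados -p my-pool ls | while read -r obj; do rados -p my-pool rm "$obj"; done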
While you can configure 1 pool to be used for RBD and Object storage, I
believe that is being deprecated and can cause unforeseen problems in the
future. It is definitely not a recommended or common use case.
On Tue, Nov 14, 2017 at 4:51 PM Christian Wuerdig <
christian.wuer...@gmail.com> wrote:
> ...             3203G  0.17  31147G  66486
> kumo-vms3     11  45824M  0.04  31147G  11643
> kumo-volumes3 13  10837M  0     31147G   2724
> kumo-images3  15  82450M  0.09  31147G  10320
I’ll try 60GB db
> partition – this is the max OSD capacity.
>
> - Rado
I'm not going to lie. This makes me dislike Bluestore quite a bit. Using
multiple OSDs to an SSD journal allowed for you to monitor the write
durability of the SSD and replace it without having to out and re-add all
of the OSDs on the device. Having to now out and backfill back onto the
HDDs is
It's probably against the inner workings of Ceph to change the ID of the
pool. There are a couple other things in Ceph that keep old data around
most likely to prevent potential collisions. One in particular is keeping
deleted_snaps in the OSD map indefinitely.
One thing I can think of in partic
The first step is to make sure that it is out of the cluster. Does `ceph
osd stat` show the same number of OSDs as in (it's the same as a line from
`ceph status`)? It should show 1 less for up, but if it's still
registering the OSD as in then the backfilling won't start. `ceph osd out
0` should
There is another thread in the ML right now covering this exact topic. The
general consensus is that for most deployments, a separate network for
public and cluster is wasted complexity.
On Thu, Nov 16, 2017 at 9:59 AM Jake Young wrote:
> On Wed, Nov 15, 2017 at 1:07 PM Ronny Aasen
> wrote:
>
That depends on another question. Does the client write all 3 copies or
does the client send the copy to the primary OSD and then the primary OSD
sends the write to the secondaries? Someone asked this recently, but I
don't recall if an answer was given. I'm not actually certain which is the
case
Another ML thread currently happening is "[ceph-users] Cluster network
slower than public network" and it has some good information that might be
useful for you.
On Thu, Nov 16, 2017 at 10:32 AM David Turner wrote:
> That depends on another question. Does the client write al
The filestore_split_multiple command does indeed need a restart of the OSD
daemon to take effect. Same with the filestore_merge_threshold. These
settings also only affect filestore. If you're using bluestore, then they
don't mean anything.
You can utilize the ceph-objectstore-tool to split subf
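The offline split looks roughly like this (a sketch for a stopped filestore
osd.0 and a placeholder pool; the new split/merge settings should already be
in ceph.conf so it splits to the new layout):

  systemctl stop ceph-osd@0
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
    --op apply-layout-settings --pool my-pool
  systemctl start ceph-osd@0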
Does letting the cluster run with noup for a while until all down disks are
idle, and then letting them come in, help at all? I don't know your
specific issue and haven't touched bluestore yet, but that is generally
sound advice when an OSD won't start.
Also is there any pattern to the osds that are d
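That is, something like:

  ceph osd set noup
  # start the OSDs, wait for them to settle and go idle, then:
  ceph osd unset noup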
7494) [0x7fb45cab4494]
>
> 17: (clone()+0x3f) [0x7fb45bb3baff]
>
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
> to interpret this.
>
>
>
> I guess even with noup the OSD/PG still has the peer with the other PG’s
> which is the stage that causes the fa
This topic has been discussed in detail multiple times and from various
angles. Your key points are going to be CPU limits iops, dwpd, iops vs
bandwidth, and SSD clusters/pools in general. You should be able to find
everything you need in the archives.
On Mon, Nov 20, 2017, 12:56 AM M Ranga Swami
I created a bug tracker for this here. http://tracker.ceph.com/issues/22201
Thank you for your help Gregory.
On Sat, Nov 18, 2017 at 9:20 PM Gregory Farnum wrote:
> On Wed, Nov 15, 2017 at 6:50 AM David Turner
> wrote:
>
>> 2 weeks later and things are still deleting, but getti
What is your current `ceph status` and `ceph df`? The status of your
cluster has likely changed a bit in the last week.
On Mon, Nov 20, 2017 at 6:00 AM gjprabu wrote:
> Hi David,
>
> Sorry for the late reply and its completed OSD Sync and more
> ever still fourth OSD av
weight and/or reweight of the osd to help the algorithm
balance that out.
On Tue, Nov 21, 2017 at 12:11 AM gjprabu wrote:
> Hi David,
>
>This is our current status.
>
>
> ~]# ceph status
> cluster b466e09c-f7ae-4e89-99a7-99d30eba0a13
> health HEALTH_WARN
All you have to do is figure out why osd.0, osd.1, and osd.2 are down and
get the daemons running. They have PGs assigned to them, but since they
are not up and running those PGs are in a down state. You can check the
logs for them in /var/log/ceph/. Did you have any errors when deploying
these
User and bucket operations have more to do with what is providing the S3
API. In this case you're using swift for that. The Ceph tools to do this
would be if you're using RGW to provide the S3 API.
The answers you're looking for would be in how to do this with SWIFT, if
I'm not mistaken. Ceph i
Yes, increasing the PG count for the data pool will be what you want to do
when you add osds to your cluster.
On Wed, Nov 22, 2017, 9:25 AM gjprabu wrote:
> Hi David,
>
> Thanks, will check osd weight settings and we are not using rbd
> and will delete. As per the pg cal
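For reference, that's the following (pool name and target count are
examples; pgp_num should follow pg_num, and PG counts can only be
increased):

  ceph osd pool set data pg_num 512
  ceph osd pool set data pgp_num 512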
If you create a subuser of the uid, then the subuser can have its own name
and key while being the same user. You can also limit a subuser to read,
write, read+write, or full permissions. Full is identical permissions for
the subuser as the user. Full enables creating and deleting buckets.
To list
e?
>
> Best Regards,
>
> 2017-11-23 9:55 GMT-02:00 Abhishek :
>
>> On 2017-11-23 12:41, Daniel Picolli Biazus wrote:
>>
>>> Hey David,
>>>
>>
>> You can create multiple keys using key create command
>>
>> radosgw-admin ke
> "traverse": 187280168,
> "traverse_hit": 185739606,
> "traverse_forward": 0,
> "traverse_discover": 0,
> "traverse_dir_fetch": 118150,
> "traverse_remote_ino": 8,
> "traverse_lock
provision 2GB and haven’t experienced any issues with that. You
also probably will need to adjust the ratios, but that was covered in other
threads previously.
David Byte
On Nov 23, 2017, at 3:
An admin node does not have any bearing on the running of the cluster.
Usually they're helpful for centralized monitoring, deploying, and
management... But none of that involves a service needed by the cluster or
information any daemon in the cluster needs.
On Thu, Nov 23, 2017, 1:08 PM Karun Josy
If you are at a point where you need to repair the xfs partition, you
should probably just rebuild the osd and backfill back onto it as a fresh
osd. That's even more true now that the repair had bad side effects.
On Sat, Nov 25, 2017, 11:33 AM Hauke Homburg
wrote:
> Hello List,
>
> Yesterday i
Disclaimer... This is slightly off topic and a genuine question. I am a
container noobie that has only used them for test environments for nginx
configs and ceph client multi-tenancy benchmarking.
I understand the benefits to containerizing RGW, MDS, and MGR daemons. I
can even come up with a de
Yep, that did it! Thanks, Zheng. I should read release notes more carefully!
On Fri, Nov 24, 2017 at 7:09 AM, Yan, Zheng wrote:
> On Thu, Nov 23, 2017 at 9:17 PM, David C wrote:
> > Hi All
> >
> > I upgraded my 12.2.0 cluster to 12.2.1 a month or two back. I've noti
Hi Jens
We also see these messages quite frequently, mainly the "replicating
dir...". Only seen "failed to open ino" a few times so didn't do any real
investigation. Our set up is very similar to yours, 12.2.1, active/standby
MDS and exporting cephfs through KNFS (hoping to replace with Ganesha
so
On 27 Nov 2017 1:06 p.m., "Jens-U. Mozdzen" wrote:
Hi David,
Zitat von David C :
Hi Jens
>
> We also see these messages quite frequently, mainly the "replicating
> dir...". Only seen "failed to open ino" a few times so didn't do any real
> inv
all NVMe, environment.
David Byte
te the OSD. If people needing storage are the user, then
they get to use an RBD, CephFS, object pool, or S3 bucket... but none of
that explains why putting the OSD into a container gives any benefit.
On Mon, Nov 27, 2017 at 1:09 AM Dai Xiang wrote:
> On Mon, Nov 27, 2017 at 03:10:09AM +,
Your EC profile requires 5 servers to be healthy. When you remove 1 OSD
from the cluster, it recovers by moving all of the copies on that OSD to
other OSDs in the same host. However when you remove an entire host, it
cannot store 5 copies of the data on the 4 remaining servers with your
crush rul
Doesn't marking something as deprecated mean that there is a better option
that we want you to use and you should switch to it sooner rather than later? I
don't understand how this is ready to be marked as such if ceph-volume
can't be switched to for all supported use cases. If ZFS, encryption,
FreeBSD,
I personally set max_scrubs to 0 on the cluster and then set it to 1 only
on the osds involved in the PG you want to scrub. Setting the cluster to
max_scrubs of 1 and then upping the involved osds to 2 might help, but is
not a guarantee.
On Tue, Nov 28, 2017 at 7:25 PM Gregory Farnum wrote:
> O
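Concretely (a sketch; osd.12, osd.34, and pg 5.3f stand in for the PG's
acting set):

  ceph tell osd.* injectargs '--osd_max_scrubs 0'
  ceph tell osd.12 injectargs '--osd_max_scrubs 1'
  ceph tell osd.34 injectargs '--osd_max_scrubs 1'
  ceph pg deep-scrub 5.3f
  # afterwards restore the default:
  ceph tell osd.* injectargs '--osd_max_scrubs 1'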
On Tue, Nov 28, 2017 at 1:50 PM, Jens-U. Mozdzen wrote:
> Hi David,
>
> Zitat von David C :
>
>> On 27 Nov 2017 1:06 p.m., "Jens-U. Mozdzen" wrote:
>>
>> Hi David,
>>
>> Zitat von David C :
>>
>> Hi Jens
>>
>>>