Hi,
I have this error:
I have 36 OSDs and get this:
Error ERANGE: pg_num 4096 size 6 would mean 25011 total pgs, which exceeds max
10500 (mon_max_pg_per_osd 250 * num_in_osds 42)
If I want to calculate the max PGs in my cluster, how does it work if I have an EC pool?
I have a 4:2 data EC pool, and the oth
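For what it's worth, here is how I read the numbers in that message; a rough sketch only, not the monitor's actual check, and it assumes each EC PG is counted once per shard (so size = k + m):

    # My reading of the ERANGE message above (not the monitor's actual code):
    # every PG counts once per replica/shard, so an EC pool with k=4, m=2 has
    # size 6 and contributes pg_num * 6 PG instances to the total.
    mon_max_pg_per_osd = 250
    num_in_osds = 42
    max_total_pgs = mon_max_pg_per_osd * num_in_osds   # 250 * 42 = 10500

    new_pool_pgs = 4096 * 6                            # pg_num * (k + m) = 24576
    existing_pgs = 25011 - new_pool_pgs                # already counted in the cluster: 435

    print(max_total_pgs, new_pool_pgs, existing_pgs)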
Vaughan;
An absolute minimal Ceph cluster really needs to be 3 servers, and at that size
usable space should be 1/3 of raw space (see the archives of this mailing list
for many discussions of why size=2 is bad).
While it is possible to run other tasks on Ceph servers, memory utilization of
Ceph pro
Hi All,
I'm not sure if this is the correct place to ask this question; I have tried
the channels, but have received very little help there.
I am currently very new to Ceph and am investigating it as a possible
replacement for a legacy application which used to provide us with replication.
A
Hi all!
Hopefully some of you can shed some light on this. We have big problems with
samba crashing when macOS smb clients access certain/random folders/files over
vfs_ceph.
When browsing the CephFS folder in question directly on a Ceph node where CephFS is
mounted, we experience some issues like slo
Addition: This happens only when I stop mon.ceph-01; I can stop any other MON
daemon without problems. I checked network connectivity and all hosts can see
all other hosts.
I already increased mon_mgr_beacon_grace to a huge value due to another bug a
long time ago:
global advanced mon_mgr_
Dear cephers,
I have a problem with MGR daemons, ceph version mimic-13.2.8. I'm trying to do
maintenance on our MON/MGR servers and am through with 2 out of 3. I have MON
and MGR collocated on a host, 3 hosts in total. So far, the procedure was to stop
the daemons on the server and do the maintenan
>
> I'm probably going to get crucified for this
Naw. The <> in your From: header, though ….
;)
Phil;
I'm probably going to get crucified for this, but I put a year of testing into
this before determining it was sufficient for the needs of my organization...
If the primary concerns are capability and cost (not top-of-the-line
performance), then I can tell you that we have had great success
I am not sure any configuration tuning would help here.
The bottleneck is on the HDDs. In my case, I have an SSD for
WAL/DB and it provides pretty good write performance.
The part I don't quite understand in your case is that
random read is quite fast. Due to the HDD seek latency,
the random read is
Also, when I reclassify-bucket to a non-existent base bucket it says: "default
parent test does not exist".
But as documented in
https://docs.ceph.com/en/latest/rados/operations/crush-map-edits/ it should
create it!
On Tue, Nov 17, 2020 at 6:05 PM Seena Fallah wrote:
> Hi all,
>
> I want to reclassif
Hello all.
I'm trying to deploy the dashboard (Nautilus 14.2.8), and after I ran ceph
dashboard create-self-signed-cert, the cluster started to show this error:
# ceph health detail
HEALTH_ERR Module 'dashboard' has failed: '_cffi_backend.CDataGCP' object
has no attribute 'type'
MGR_MODULE_ERROR
I have disabled CephX authentication now. Though the performance is slightly
better, it is not yet there.
Are there any other recommendations for all-HDD Ceph clusters?
From another thread
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/DFHXXN4KKI5PS7LYPZJO4GYHU67JYVVL/
*In our
Hi Krasaev,
Thanks for pointing out this issue! This is currently under review here:
[1], and tracked here: [2].
Once merged, the fix would be available on the master development branch,
and the plan is to backport the fix to Octopus in the future.
Yuval
[1] https://github.com/ceph/ceph/pull/3813
Hi all,
I want to reclassify my CRUSH map. I have two roots, one hiops and one
default. In the hiops root I have one datacenter, and in that I have three racks,
and in each rack I have 3 OSDs. When I run the command below it says "item
-55 in bucket -54 is not also a reclassified bucket". I see the new
cr
Hi,
> I don't think the default osd_min_pg_log_entries has changed recently.
> In https://tracker.ceph.com/issues/47775 I proposed that we limit the
> pg log length by memory -- if it is indeed possible for log entries to
> get into several MB, then this would be necessary IMHO.
I've had a surpri
Hello,
We are running an Octopus cluster; however, we still have some older Ubuntu
16.04 clients connecting using libcephfs2 version 14.2.13-1xenial.
From time to time the network had issues, so the
clients lost the connection to the cluster.
But the system still thinks th
Hi Phil,
thanks for the background info.
On 17.11.20 at 01:51, Phil Merricks wrote:
> 1: Move off the data and scrap the cluster as it stands currently.
> (already under way)
> 2: Group the block devices into pools of the same geometry and type (and
> maybe do some tiering?)
> 3. Spread the O
Hi Dan,
I 100% agree with your proposal. One of the goals I had in mind with
the prioritycache framework is that pglog could end up becoming another
prioritycache target that is balanced against the other caches. The
idea would be that we try to keep some amount of pglog data in memory at
I don't think the default osd_min_pg_log_entries has changed recently.
In https://tracker.ceph.com/issues/47775 I proposed that we limit the
pg log length by memory -- if it is indeed possible for log entries to
get into several MB, then this would be necessary IMHO.
But you said you were trimming
Another idea, which I don't know if it has any merit.
If 8 MB is a realistic log size (or has this grown for some reason?), did the
enforcement (or default) of the minimum value change lately
(osd_min_pg_log_entries)?
If the minimum were set to 1000, at 8 MB per log, we would have
iss
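As a rough illustration of that worry (the PGs-per-OSD figures below are only assumed for the example, not taken from this thread):

    # Back-of-the-envelope sketch: if every PG log sits around 8 MB, the per-OSD
    # pg_log memory scales linearly with the number of PGs hosted on that OSD.
    # The PGs-per-OSD values are illustrative assumptions only.
    mb_per_pg_log = 8
    for pgs_per_osd in (100, 200):
        print(f"{pgs_per_osd} PGs/OSD -> ~{pgs_per_osd * mb_per_pg_log} MB of pg_log memory")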
On Tue, Nov 17, 2020 at 11:45 AM Kalle Happonen wrote:
>
> Hi Dan @ co.,
> Thanks for the support (moral and technical).
>
> That sounds like a good guess, but it seems like there is nothing alarming
> here. In all our pools, some pgs are a bit over 3100, but not at any
> exceptional values.
>
>
Hi Dan @ co.,
Thanks for the support (moral and technical).
That sounds like a good guess, but it seems like there is nothing alarming
here. In all our pools, some pgs are a bit over 3100, but not at any
exceptional values.
cat pgdumpfull.txt | jq '.pg_map.pg_stats[] |
select(.ondisk_log_size >
Hi Kalle,
Do you have active PGs now with huge pglogs?
You can do something like this to find them:
ceph pg dump -f json | jq '.pg_map.pg_stats[] |
select(.ondisk_log_size > 3000)'
If you find some, could you increase to debug_osd = 10 and then share the OSD log?
I am interested in the debug line
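If jq isn't handy, here is a small sketch of the same filter (it assumes the ceph CLI is in the PATH and the local keyring allows "ceph pg dump"; the "pgid" field name is my assumption):

    # Same filter as the jq one-liner above, done in Python instead.
    import json
    import subprocess

    dump = json.loads(subprocess.check_output(["ceph", "pg", "dump", "-f", "json"]))
    for pg in dump["pg_map"]["pg_stats"]:
        if pg["ondisk_log_size"] > 3000:
            print(pg["pgid"], pg["ondisk_log_size"])  # "pgid" field name assumed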
Hi Xie,
On Tue, Nov 17, 2020 at 11:14 AM wrote:
>
> Hi Dan,
>
>
> > Given that it adds a case where the pg_log is not trimmed, I wonder if
> > there could be an unforeseen condition where `last_update_ondisk`
> > isn't being updated correctly, and therefore the osd stops trimming
> > the pg_log a
Hi Kalle,
Strangely and luckily, in our case the memory explosion didn't reoccur
after that incident. So I can mostly only offer moral support.
But if this bug indeed appeared between 14.2.8 and 14.2.13, then I
think this is suspicious:
b670715eb4 osd/PeeringState: do not trim pg log past las
Hello,
Currently, we are experiencing problems with a cluster used for storing
RBD backups. Config:
* 8 nodes, each with 6 HDD OSDs and 1 SSD used for block DB and WAL
* k=4 m=2 EC
* dual 25GbE NIC
* v14.2.8
ceph health detail shows the following messages:
HEALTH_WARN BlueFS spillover detected
Hello all,
wrt:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/7IMIWCKIHXNULEBHVUIXQQGYUDJAO2SF/
Yesterday we hit a problem with osd_pglog memory, similar to the thread above.
We have a 56-node object storage (S3+Swift) cluster with 25 OSD disks per node.
We run 8+3 EC for the d
I have run radosgw-admin gc list (without --include-all) a few times
already, but the list was always empty. I will create a cron job running
it every few minutes and writing out the results.
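In case it is useful to anyone else, a minimal sketch of such a periodic check (the interval, log path, and use of plain "gc list" are my assumptions; a real cron entry would do the same job):

    # Minimal sketch: run "radosgw-admin gc list" every few minutes and append
    # the output to a log file. Interval and path are assumptions.
    import subprocess
    import time
    from datetime import datetime

    while True:
        out = subprocess.run(["radosgw-admin", "gc", "list"],
                             capture_output=True, text=True).stdout
        with open("/var/log/rgw-gc-list.log", "a") as f:
            f.write(f"--- {datetime.now().isoformat()} ---\n{out}\n")
        time.sleep(300)  # every 5 minutes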
On 17/11/2020 02:22, Eric Ivancich wrote:
I’m wondering if anyone experiencing this bug would mind runn