Dear All,
My apologies, I forgot to state we are using Quincy 17.2.6
thanks again,
Jake
root@wilma-s1 15:22 [~]: ceph -v
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy
(stable)
Dear All,
we are trying to recover from what we suspect is a corrupt MDS :(
and have been wanting
to get a feeling from others about how dangerous this could be?
We have a backup, but as there is 1.8PB of data, it's going to take a
few weeks to restore....
any ideas gratefully received.
Jake
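A minimal, non-destructive first step in this situation (a sketch only; the filesystem name "cephfs", single rank 0, and the output path are assumptions) is to inspect and export the MDS journal before running any repair tooling:

# read-only integrity check of the journal for rank 0
cephfs-journal-tool --rank=cephfs:0 journal inspect
# keep a copy of the journal before touching anything
cephfs-journal-tool --rank=cephfs:0 journal export /root/cephfs-journal-rank0.bin

The exported copy gives you something to fall back on if a later recovery step makes things worse.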
--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of
kernel driver in AlmaLinux 8.6, plus a recent version of Samba, together
with Quincy improve performance...
best regards
Jake
--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Francis Crick Avenue,
Cambridge CB2 0QH, UK.
5 GiB 67 GiB 1 KiB
914 MiB 16 TiB 2.14 1.01 99 up
thanks
Jake
On 20/07/2022 11:52, Jake Grimmett wrote:
Dear All,
We have just built a new cluster using Quincy 17.2.1
After copying ~25TB to the cluster (from a mimic cluster), we see 152 TB
used, which is ~6x disparity.
Is t
coded data pool (hdd with NVMe db/wal),
and a 3x replicated default data pool (primary_fs_data - NVMe)
bluestore_min_alloc_size_hdd is 4096
ceph osd pool set ec82pool compression_algorithm lz4
ceph osd pool set ec82pool compression_mode aggressive
many thanks for any help
Jake
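For what it's worth, a quick way to sanity-check the settings involved (a sketch, reusing the pool name ec82pool from above):

# confirm the pool-level compression settings took effect
ceph osd pool get ec82pool compression_algorithm
ceph osd pool get ec82pool compression_mode
# per-pool STORED vs USED plus compression stats (USED includes EC overhead)
ceph df detail
# current config value; note min_alloc_size is baked in when an OSD is created,
# so existing OSDs keep the value they were built with
ceph config get osd bluestore_min_alloc_size_hdd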
--
Dr Jake Grimmett
ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
hdd    7.0 PiB  6.9 PiB  126 TiB   126 TiB       1.75
ssd    2.7 TiB  2.7 TiB  3.2 GiB   3.2 GiB       0.12
TOTAL  7.0 PiB  6.9 PiB  126 TiB   126 TiB       1.75
--- POOLS ---
POOL  ID  PGS  STO...  [pool rows truncated in the archive]
Any ideas on what might be going on?
We get a similar problem if we specify hdd as the class.
best regards
Jake
--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Fra
https://docs.ceph.com/en/latest/rados/configuration/mon-osd-interaction/#confval-mon_osd_down_out_subtree_limit
The default is rack -- you want to set that to "host".
Cheers, Dan
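In case it helps anyone finding this later, a sketch of the corresponding commands (assuming the centralized config database is in use):

# check the current value (default is "rack")
ceph config get mon mon_osd_down_out_subtree_limit
# only mark OSDs out automatically up to the host level
ceph config set mon mon_osd_down_out_subtree_limit host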
On Fri., Feb. 18, 2022, 11:23 Jake Grimmett <j...@mrc-lmb.cam.ac.uk> wrote:
look at turning the watchdog on, giving nagios an action, etc,
but I'd rather use any tools that ceph has built in.
BTW, this is an Octopus cluster 15.2.15, 580 x OSDs, using EC 8+2
best regards,
Jake
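One built-in option (not necessarily what was meant here, but it does ship with Ceph) is the mgr "alerts" module, which sends email when the health state changes; the hostnames and addresses below are placeholders:

ceph mgr module enable alerts
ceph config set mgr mgr/alerts/smtp_host smtp.example.com
ceph config set mgr mgr/alerts/smtp_destination storage-admins@example.com
ceph config set mgr mgr/alerts/smtp_sender ceph-alerts@example.com
# if the relay does not use SSL
ceph config set mgr mgr/alerts/smtp_ssl false
ceph config set mgr mgr/alerts/smtp_port 25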
--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
's wizard.
If for some reason you cannot or wish not to opt in, please share the
reason with us.
Thanks,
Yaarit
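For anyone following along, opting in from the CLI (rather than the dashboard wizard) is roughly:

# review exactly what would be reported before agreeing to anything
ceph telemetry show
# opt in (requires accepting the data-sharing license)
ceph telemetry on --license sharing-1-0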
On Thu, Jan 20, 2022 at 6:39 AM Jake Grimmett <j...@mrc-lmb.cam.ac.uk> wrote:
Dear All,
Is the cloud option for the diskprediction module deprecated? Is the
module useful?
many thanks
Jake
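For reference, the local predictor can be tried without the cloud option at all; a sketch, with <devid> standing in for an actual device id from "ceph device ls":

ceph mgr module enable diskprediction_local
ceph config set global device_failure_prediction_mode local
# list devices and pull the SMART-style metrics the predictor uses
ceph device ls
ceph device get-health-metrics <devid>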
--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Francis Crick Avenue,
Cambridge CB2 0QH, UK.
ashboard1", so we could add a setting to customize that if required.
Kind Regards,
Ernesto
--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Francis Crick Avenue,
Cambridge CB2 0QH, UK.
Dashboard1 in grafana?
The grafana install docs here:
https://docs.ceph.com/en/latest/mgr/dashboard/
state:
"Add Prometheus as data source to Grafana using the Grafana Web UI."
If the data source is now hard coded to "Dashboard1", can we update the
docs?
best regards,
Jake
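In the meantime, a workaround sketch that satisfies the hard-coded name: provision a Prometheus data source called "Dashboard1" yourself (the file path, host and port are assumptions for a stock Grafana/Prometheus install):

cat > /etc/grafana/provisioning/datasources/ceph-dashboard.yml <<EOF
apiVersion: 1
datasources:
  - name: 'Dashboard1'
    type: 'prometheus'
    access: 'proxy'
    url: 'http://prometheus-host:9090'
    isDefault: true
EOF
systemctl restart grafana-server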
Note: I am working from home until further notice.
For help, contact unixad...@mrc-lmb.cam.ac.uk
--
Dr Jake Grimmett
Head Of Scientif
that’s helpful.
>
> Sent from my iPad
>
>> On Sep 29, 2020, at 18:34, Jake Grimmett wrote:
>>
>> Hi Paul,
>>
>> I think you found the answer!
>>
>> When adding 100 new OSDs to the cluster, I increased both pg and pgp
>> from 4096 to 1
> You can check this by running "ceph osd pool ls detail" and check for
> the value of pg target.
>
> Also: Looks like you've set osd_scrub_during_recovery = false, this
> setting can be annoying on large erasure-coded setups on HDDs that see
> long recovery times. It's better to get IO priorities right; search
> mailing list for osd op queue cut off high.
>
> Paul
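For the archives, the setting Paul refers to can be applied like this (a sketch; it is not a runtime change, so the OSDs need a restart to pick it up):

ceph config get osd osd_op_queue_cut_off
ceph config set osd osd_op_queue_cut_off high
# then restart the OSDs, e.g. one failure domain at a time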
--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Francis Crick Avenue,
Cambridge CB2 0QH, UK.
> On 2020-09-28 11:45, Jake Grimmett wrote:
>
>> To show the cluster before and immediately after an "episode"
>>
>> ***
>>
>> [root@ceph7 ceph]# ceph -s
>> cluster:
>> id: 36ed7113-08
log_channel(cluster) log [DBG] :
5.157ds0 starting backfill to osd.469(7) from (0'0,0'0] MAX to
106803'6043528
2020-09-24 14:44:38.938 7f2e569e9700 0 log_channel(cluster) log [DBG] :
5.157ds0 starting backfill to osd.508(1) from (0'0,0'0] MAX to
106803'6043528
2020-09-24 14:44:38.947 7f2e569e9700 0 log_channel(clus
5.656 7f3cfe5f9700 0 mds.0.cache creating system
inode with ino:0x1
best regards,
Jake
On 29/04/2020 14:33, Jake Grimmett wrote:
> Dear all,
>
> After enabling "allow_standby_replay" on our cluster we are getting
> (lots) of identical errors on the client /var/log/messa
allow_standby_replay?
any advice appreciated,
many thanks
Jake
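If it helps anyone else hitting this, toggling the feature is a one-liner (assuming the filesystem is named "cephfs"):

# turn standby-replay off (or back on with "true")
ceph fs set cephfs allow_standby_replay false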
Note: I am working from home until further notice.
For help, contact unixad...@mrc-lmb.cam.ac.uk
--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Francis Crick Avenue,
Cambridge CB2 0QH, UK.
Phone 01223 267
(clone_size.count(clone)) leaving us with a pg in a very bad
state...
I will see if we can buy some consulting time, the alternative is
several weeks of rsync.
Many thanks again for your advice, it's very much appreciated,
Jake
On 26/03/2020 17:21, Gregory Farnum wrote:
On Wed, Mar 25
regards,
Jake
On 25/03/2020 14:22, Eugen Block wrote:
Hi,
is there any chance to recover the other failing OSDs that seem to
have one chunk of this PG? Do the other OSDs fail with the same error?
Zitat von Jake Grimmett :
Dear All,
We are "in a bit of a pickle"...
No reply t
" or other advice gratefully received,
best regards,
Jake
Note: I am working from home until further notice.
For help, contact unixad...@mrc-lmb.cam.ac.uk
--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Francis Crick Avenue,
471'3200829 2020-01-28 15:48:35.574934
This cluster is being used to backup a live cephfs cluster and has 1.8PB
of data, including 30 days of snapshots. We are using 8+2 EC.
Any help appreciated,
Jake
Note: I am working from home until further notice.
For help, contact unixad...@m
> this was possible and there
> was no suggestion to use a default replicated pool and then add the EC
> pool. We did exactly the other way around :-/
>
> Best
> Dietmar
>
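For anyone setting this up from scratch, the order Dietmar describes would look roughly like this (pool names, PG counts, the EC profile and the mount path are illustrative only):

# replicated pools for metadata and the default data pool
ceph osd pool create cephfs_metadata 64
ceph osd pool create cephfs_data 64
ceph fs new cephfs cephfs_metadata cephfs_data
# then attach the erasure-coded pool as an additional data pool
ceph osd pool create ec82pool 1024 1024 erasure ec82profile
ceph osd pool set ec82pool allow_ec_overwrites true
ceph fs add_data_pool cephfs ec82pool
# and point the bulk data directory at it via a file layout
setfattr -n ceph.dir.layout.pool -v ec82pool /mnt/cephfs/data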
--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Francis Crick Avenue,
C
ile system at this time. Someday we would like
> to change this but there is no timeline.
>
--
Dr Jake Grimmett
MRC Laboratory of Molecular Biology
Francis Crick Avenue,
Cambridge CB2 0QH, UK.
43 --data /dev/sdab
activate the OSD
# ceph-volume lvm activate 443 6e252371-d158-4d16-ac31-fed8f7d0cb1f
Now watching to see if the cluster recovers...
best,
Jake
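A couple of quick checks while waiting, nothing clever:

# confirm osd.443 is up/in again
ceph osd tree | grep -w osd.443
# watch misplaced/degraded objects drain down
ceph -s
ceph pg stat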
On 2/10/20 3:31 PM, Jake Grimmett wrote:
> Dear All,
>
> Following a clunky* cluster restart, we had
>
> 23 &
failing its
primary OSD)
* thread describing the bad restart:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/IRKCDRRAH7YZEVXN5CH4JT2NH4EWYRGI/#IRKCDRRAH7YZEVXN5CH4JT2NH4EWYRGI
many thanks!
Jake
--
Dr Jake Grimmett
MRC Laboratory of Molecular Biology
Francis Crick Av
ed various OSD restarts, deep-scrubs, with no change. I'm leaving
> things alone hoping that croit.io will update their package to 13.2.8
> soonish. Maybe that will help kick it in the pants.
>
> Chad.
pg:
[root@ceph1 ~]# ceph osd down 347
This doesn't change the output of "ceph pg 5.5c9 query", apart from
updating the Started time, and ceph health still shows unfound objects.
To fix this, do we need to issue a scrub (or deep scrub) so that the
objects
0'0",
"flags": "none",
"locations": [
"189(8)",
"263(9)"
]
}
],
"more": false
}
While it would be nice
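For completeness, a sketch of the usual next steps with the PG id from above (the last command abandons the unfound objects and is a last resort only; on an EC pool the only available mode is "delete"):

# list the unfound objects and their possible locations
ceph pg 5.5c9 list_unfound
# re-peer the primary and force a deep scrub
ceph osd down 347
ceph pg deep-scrub 5.5c9
# last resort: give up on the unfound objects
ceph pg 5.5c9 mark_unfound_lost delete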