Re: [ceph-users] after reboot node appear outside the root root tree

2017-09-13 Thread dE
On 09/13/2017 09:08 PM, German Anders wrote: Hi cephers, I'm having an issue with a newly created cluster 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc). Basically when I reboot one of the nodes, when it comes back it comes up outside of the root type in the tree: root@cpm0

[ceph-users] What's 'failsafe full'

2017-09-13 Thread dE
Hello everyone, Just started with Ceph here. I was reading the documentation here -- http://docs.ceph.com/docs/master/rados/operations/health-checks/#osd-out-of-order-full And just started to wonder what's failsafe_full... I know it's some kind of ratio, but how do I change it? I didn't f
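
For reference, a rough sketch of how the fullness thresholds can be inspected on a Luminous-era cluster; failsafe_full corresponds to the osd_failsafe_full_ratio OSD config option (the hard cutoff at which an OSD refuses writes), not one of the ratios stored in the OSD map. The 0.98 value below is only illustrative:

    ceph osd dump | grep ratio                                  # full_ratio / backfillfull_ratio / nearfull_ratio
    ceph daemon osd.0 config get osd_failsafe_full_ratio        # admin socket; run on the host of osd.0
    ceph tell osd.* injectargs '--osd_failsafe_full_ratio 0.98' # runtime change; persist it in ceph.conf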

[ceph-users] unknown PG state in a newly created pool.

2017-09-13 Thread dE .
Hi, In my test cluster where I have just 1 OSD which is up and in -- 1 osds: 1 up, 1 in I create a pool with size 1, min_size 1 and a pg_num of 1, 2, 3 or any number. However I cannot write objects to the cluster. The PGs are stuck in an unknown state -- ceph -c /etc/ceph/cluster.conf health deta

Re: [ceph-users] unknown PG state in a newly created pool.

2017-09-13 Thread dE .
Ok, removed this line and it got fixed -- crush location = "region=XX datacenter= room= row=N rack=N chassis=N" But why does it matter? On Thu, Sep 14, 2017 at 12:11 PM, dE . wrote: > Hi, > In my test cluster where I have just 1 OSD which is up and in -- >
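
A plausible reading of why it mattered: "crush location" is applied when the OSD starts (osd crush update on start is enabled by default), so bucket names listed there that don't exist under the root walked by the CRUSH rule leave the OSD outside that tree, and PGs can never map to it. A minimal sketch of a location that stays inside the default root (the host name is an assumption):

    [osd]
        crush location = "host=node1 root=default"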

[ceph-users] OSD_OUT_OF_ORDER_FULL even when the ratios are in order.

2017-09-14 Thread dE .
Hi, I got a ceph cluster where I'm getting an OSD_OUT_OF_ORDER_FULL health error, even though it appears that everything is in order -- full_ratio 0.99 backfillfull_ratio 0.97 nearfull_ratio 0.98 These don't seem like a mistake to me but ceph is complaining -- OSD_OUT_OF_ORDER_FULL full ratio(s) out o
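
The check expects the thresholds ordered nearfull < backfillfull < full (with failsafe above full); in the values above, nearfull (0.98) sits above backfillfull (0.97), which is what triggers the error. A sketch of setting a conventional ordering on Luminous (the values are just the common defaults, not a recommendation for this cluster):

    ceph osd set-nearfull-ratio 0.85
    ceph osd set-backfillfull-ratio 0.90
    ceph osd set-full-ratio 0.95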

Re: [ceph-users] OSD_OUT_OF_ORDER_FULL even when the ratios are in order.

2017-09-14 Thread dE .
Aasen wrote: > On 14. sep. 2017 11:58, dE . wrote: > >> Hi, >> I got a ceph cluster where I'm getting a OSD_OUT_OF_ORDER_FULL >> health error, even though it appears that it is in order -- >> >> full_ratio 0.99 >> backfillfull_ratio 0.97 >&

Re: [ceph-users] OSD_OUT_OF_ORDER_FULL even when the ratios are in order.

2017-09-14 Thread dE .
ou > too close to your full_ratio that you are in a high danger of blocking all > IO to your cluster. > > Even if you stick with the defaults you're in a good enough situation > where you will be most likely able to recover from most failures in your > cluster. But don't

[ceph-users] 'flags' of PG.

2017-09-15 Thread dE .
Hi, I was going through the health check documentation, where I found references to 'PG flags' like degraded, undersized, backfill_toofull or recovery_toofull etc... I find traces of these flags throughout the documentation, but

[ceph-users] Brand new cluster -- pg is stuck inactive

2017-10-13 Thread dE
Hi, I'm running ceph 10.2.5 on Debian (official package). It can't seem to create any functional pools -- ceph health detail HEALTH_ERR 64 pgs are stuck inactive for more than 300 seconds; 64 pgs stuck inactive; too few PGs per OSD (21 < min 30) pg 0.39 is stuck inactive for 652.741684, cur
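
A few first-line checks for a brand new cluster in this state, assuming the default "rbd" pool of a Jewel install; the pg_num value is only an example for a small test cluster:

    ceph osd tree                      # are all OSDs up/in and carrying a non-zero CRUSH weight?
    ceph pg dump_stuck inactive        # which PGs are affected and whether any OSDs are acting
    ceph osd pool set rbd pg_num 128   # addresses "too few PGs per OSD"; follow with pgp_num
    ceph osd pool set rbd pgp_num 128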

Re: [ceph-users] Brand new cluster -- pg is stuck inactive

2017-10-13 Thread dE
On 10/13/2017 10:23 PM, dE wrote: Hi,     I'm running ceph 10.2.5 on Debian (official package). It cant seem to create any functional pools -- ceph health detail HEALTH_ERR 64 pgs are stuck inactive for more than 300 seconds; 64 pgs stuck inactive; too few PGs per OSD (21 < min 30)

Re: [ceph-users] Brand new cluster -- pg is stuck inactive

2017-10-13 Thread dE
On 10/13/2017 10:23 PM, dE wrote: Hi,     I'm running ceph 10.2.5 on Debian (official package). It cant seem to create any functional pools -- ceph health detail HEALTH_ERR 64 pgs are stuck inactive for more than 300 seconds; 64 pgs stuck inactive; too few PGs per OSD (21 < min 30)

Re: [ceph-users] Brand new cluster -- pg is stuck inactive

2017-10-13 Thread dE
no osd is acting for your pg's can you show the output from ceph osd tree mvh Ronny Aasen On 13.10.2017 18:53, dE wrote: > Hi, > >     I'm running ceph 10.2.5 on Debian (official package). > > It cant seem to create any functiona

Re: [ceph-users] Brand new cluster -- pg is stuck inactive

2017-10-14 Thread dE
cluster. On Sat, Oct 14, 2017, 10:01 AM dE . <de.tec...@gmail.com> wrote: I attached 1TB disks to each osd. cluster 8161c90e-dbd2-4491-acf8-74449bef916a health HEALTH_ERR     clock skew detected on mon.1, mon.2     64 pgs are stuck in

Re: [ceph-users] Brand new cluster -- pg is stuck inactive

2017-10-14 Thread dE
, Denes. On 10/14/2017 07:39 PM, dE wrote: On 10/14/2017 08:18 PM, David Turner wrote: What are the ownership permissions on your osd folders? Clock skew cares about partial seconds. It isn't the networking issue because your cluster isn't stuck peering. I'm not sure if the

Re: [ceph-users] qemu packages for el7

2014-06-18 Thread Stijn De Weirdt
hi kenneth, yes there are if you create them. from the centos git sources wiki, use the qemu-kvm repo, in the spec set rhev to 1 (and change the release) and build it. update the installed rpms and done. works out of the box (but maybe not so much from your side of our office ;) stijn On 06

[ceph-users] ceph kernel module for centos7

2014-06-27 Thread Stijn De Weirdt
hi all, does anyone know how to build the ceph.ko module for centos7 (3.10.0-123.el7 kernel) QA release? rebuilding the 0.81 ceph-kmod src rpms gives modules for libceph and rbd, but the one for ceph fails with error (same issue as for rhel7rc, see https://github.com/ceph/ceph-kmod-rpm/issues

[ceph-users] uWSGI native rados plugin

2014-06-28 Thread Roberto De Ioris
ystem (for triggering events on failed rados pings and so on) Hope it will be useful for someone. Every report will be welcomed. Thanks -- Roberto De Ioris http://unbit.it ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/

[ceph-users] cephfs and EC

2014-07-08 Thread Stijn De Weirdt
hi all, one of the changes in the 0.82 release (according to the notes) is: mon: prevent EC pools from being used with cephfs can someone clarify this a bit? cephfs with EC pools makes no sense? now? ever? or is it just not recommended (i'm also interested in the technical reasons behind it)

Re: [ceph-users] cephfs and EC

2014-07-08 Thread Stijn De Weirdt
hi mark, thanks for clarifying it a bit. we'll certainly have a look at the caching tier setup. stijn On 07/08/2014 01:53 PM, Mark Nelson wrote: On 07/08/2014 04:28 AM, Stijn De Weirdt wrote: hi all, one of the changes in the 0.82 release (according to the notes) is: mon: preve

Re: [ceph-users] ceph kernel module for centos7

2014-07-08 Thread Stijn De Weirdt
e that the kmod-ceph src rpms also provide the ceph module. stijn On 06/27/2014 06:26 PM, Stijn De Weirdt wrote: hi all, does anyone know how to build the ceph.ko module for centos7 (3.10.0-123.el7 kernel) QA release? rebuilding the 0.81 ceph-kmod src rpms gives modules for libceph and rbd, b

Re: [ceph-users] cephfs and EC

2014-07-09 Thread Stijn De Weirdt
cache pool fails). but probably the read-cache has to be the same as the write cache (eg when people want to modify a file). stijn On 07/08/2014 05:24 PM, Stijn De Weirdt wrote: hi mark, thanks for clarifying it a bit. we'll certainly have a look at the caching tier setup. stijn On 07/

Re: [ceph-users] v0.84 released

2014-08-26 Thread Stijn De Weirdt
hi all, there are a zillion OSD bug fixes. Things are looking pretty good for the Giant release that is coming up in the next month. any chance of having a compilable cephfs kernel module for el7 for the next major release? stijn ___ ceph-users ma

Re: [ceph-users] ceph issue: rbd vs. qemu-kvm

2014-09-17 Thread Stijn De Weirdt
hi steven, we ran into issues when trying to use a non-default ceph user in opennebula (don't remember what the default was; but it's probably not libvirt2), patches are in https://github.com/OpenNebula/one/pull/33, devs sort-of confirmed they will be in 4.8.1. this way you can set CEPH_

Re: [ceph-users] IRQ balancing, distribution

2014-09-22 Thread Stijn De Weirdt
hi christian, we once were debugging some performance issues, and IRQ balancing was one of the things we looked into, but no real benefit there for us. all interrupts on one cpu is only an issue if the hardware itself is not the bottleneck. we were running some default SAS HBA (Dell H200), and

Re: [ceph-users] IRQ balancing, distribution

2014-09-22 Thread Stijn De Weirdt
but another issue is the OSD processes: do you pin those as well? and how much data do they actually handle. to checksum, the OSD process needs all data, so that can also cause a lot of NUMA traffic, esp if they are not pinned. That's why all my (production) storage nodes have only a single 6 or

Re: [ceph-users] IRQ balancing, distribution

2014-09-22 Thread Stijn De Weirdt
e for each OSD; can these be HT cores or actual physical cores? stijn Regards, Anand -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Stijn De Weirdt Sent: Monday, September 22, 2014 2:36 PM To: ceph-users@lists.ceph.com Subject: Re: [ceph-

[ceph-users] NVMe Journal and Mixing IO

2015-05-04 Thread Atze de Vries
Hi, We are designing a new Ceph cluster. Some of the cluster will be used to run vms and most of it will be used for file storage and object storage. We want to separate the workload for vms (high IO / small block) from the bulk storage (big block, lots of latency) since mixing IO seems to be a bad i

[ceph-users] RadosGW not working after upgrade to Hammer

2015-05-26 Thread Arnoud de Jonge
nalServer /var/www/s3gw.fcgi -socket /var/run/ceph/ceph.radosgw.controller1.fastcgi.sock AllowEncodedSlashes on Any thoughts on what could be wrong? Kind regards, Arnoud de Jonge ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

[ceph-users] ceph-disk activate /dev/sda1 seem to get stuck?

2015-06-05 Thread Jelle de Jong
going wrong and how to fix it? Kind regards, Jelle de Jong ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] ceph-disk activate /dev/sda1 seem to get stuck?

2015-06-08 Thread Jelle de Jong
On 05/06/15 21:50, Jelle de Jong wrote: > I am new to ceph and I am trying to build a cluster for testing. > > after running: > ceph-deploy osd prepare --zap-disk ceph02:/dev/sda > > It seems udev rules find the disk and try to activate them, but then > gets stuck: > &

Re: [ceph-users] ceph-disk activate /dev/sda1 seem to get stuck?

2015-06-08 Thread Jelle de Jong
version 0.80.9 (b5a67f0e1d15385bc0d60a6da6e7fc810bde6047) (from https://packages.debian.org/jessie/ceph) I will try to purge everything and retry to make sure there is no "old" data intervening. Does anyone know what is going on? Kind regards, Jelle de Jong On 08/06/15 09:58, Christian Balzer wrote: > > Hello, >

[ceph-users] how do i install ceph from apt on debian jessie?

2015-06-08 Thread Jelle de Jong
://paste.debian.net/211955/ How do I install ceph on Debian Jessie (8.1)? Kind regards, Jelle de Jong ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] how do i install ceph from apt on debian jessie?

2015-06-08 Thread Jelle de Jong
On 08/06/15 13:22, Jelle de Jong wrote: > I could not get ceph to work with the ceph packages shipped with debian > jessie: http://paste.debian.net/211771/ > > So I tried to use apt-pinning to use the eu.ceph.com apt repository, but > there are too many dependencies that are unreso

[ceph-users] SSD test results with Plextor M6 Pro, HyperX Fury, Kingston V300, ADATA SP90

2015-06-18 Thread Jelle de Jong
er loss of all nodes at the same time should not be possible (or has an extremely low probability) #4 how to benchmark the OSD (disk+ssd-journal) combination so I can compare them. I got some other benchmark questions, but I will make a separate mail for them. Kind regards, Jel

[ceph-users] reversing the removal of an osd (re-adding osd)

2015-06-19 Thread Jelle de Jong
exist. create it before updating the crush map failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.5 --keyring=/var/lib/ceph/osd/ceph-5/keyring osd crush create-or-move -- 5 0.91 host=ceph03 root=default' Can somebody show me some examples of the right commands to re
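
A sketch of the re-add sequence from the add-or-rm-osds document referenced in the thread, assuming the data directory and keyring in /var/lib/ceph/osd/ceph-5 are still intact; the id, weight and host come from the error above:

    ceph osd create                    # allocate an id again (ideally it hands back 5)
    ceph auth add osd.5 osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-5/keyring
    ceph osd crush add osd.5 0.91 host=ceph03 root=default
    ceph osd in osd.5
    # then start the daemon, e.g. /etc/init.d/ceph start osd.5 or systemctl start ceph-osd@5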

Re: [ceph-users] reversing the removal of an osd (re-adding osd)

2015-06-19 Thread Jelle de Jong
On 19/06/15 16:07, Jelle de Jong wrote: > Hello everybody, > > I'm doing some experiments and I am trying to re-add a removed osd. I > removed it with the below five commands. > > http://ceph.com/docs/master/rados/operations/add-or-rm-osds/ > > ceph osd out 5

[ceph-users] how to recover from: 1 pgs down; 10 pgs incomplete; 10 pgs stuck inactive; 10 pgs stuck unclean

2015-07-13 Thread Jelle de Jong
ow could I figure out in what pool the data was lost and in what rbd volume (so what kvm guest lost data). Kind regards, Jelle de Jong ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] how to recover from: 1 pgs down; 10 pgs incomplete; 10 pgs stuck inactive; 10 pgs stuck unclean

2015-07-15 Thread Jelle de Jong
On 13/07/15 15:40, Jelle de Jong wrote: > I was testing a ceph cluster with osd_pool_default_size = 2 and while > rebuilding the OSD on one ceph node a disk in an other node started > getting read errors and ceph kept taking the OSD down, and instead of me > executing ceph osd set nodo

Re: [ceph-users] how to recover from: 1 pgs down; 10 pgs incomplete; 10 pgs stuck inactive; 10 pgs stuck unclean

2015-07-22 Thread Jelle de Jong
On 15/07/15 10:55, Jelle de Jong wrote: > On 13/07/15 15:40, Jelle de Jong wrote: >> I was testing a ceph cluster with osd_pool_default_size = 2 and while >> rebuilding the OSD on one ceph node a disk in an other node started >> getting read errors and ceph kept taking the OSD

Re: [ceph-users] dropping old distros: el6, precise 12.04, debian wheezy?

2015-07-30 Thread Stijn De Weirdt
i would certainly like that all client libs and/or kernel modules stay tested and supported on these OSes for future ceph releases. not sure how much work that is, but at least the client side shouldn't be affected by the init move. stijn On 07/30/2015 04:43 PM, Marc wrote: Hi, much like de

Re: [ceph-users] Check networking first?

2015-07-30 Thread Stijn De Weirdt
wouldn't it be nice if ceph did something like this in the background (some sort of network-scrub). debugging the network like this is not that easy (can't expect admins to install e.g. perfsonar on all nodes and/or clients) something like: every X min, each service X picks a service Y on another h

Re: [ceph-users] Check networking first?

2015-08-03 Thread Stijn De Weirdt
Like a lot of system monitoring stuff, this is the kind of thing that in an ideal world we wouldn't have to worry about, but the experience in practice is that people deploy big distributed storage systems without having really good monitoring in place. We (people providing not to become complete

Re: [ceph-users] btrfs w/ centos 7.1

2015-08-08 Thread Stijn De Weirdt
hi jan, The answer to this, as well as life, universe and everything, is simple: ZFS. is it really the case for ceph? i briefly looked in the filestore code a while ago, since zfs is COW, i expected not to have a journal with ZFS, but i couldn't find anything that suggested this was supported

Re: [ceph-users] SSD test results with Plextor M6 Pro, HyperX Fury, Kingston V300, ADATA SP90

2015-09-01 Thread Jelle de Jong
sible scheduler (noop) changes persistent (cmd in rc.local or special udev rules, examples?) Kind regards, Jelle de Jong On 23/06/15 12:41, Jan Schermer wrote: > Those are interesting numbers - can you rerun the test with write cache > enabled this time? I wonder how much your d

Re: [ceph-users] the state of cephfs in giant

2014-10-15 Thread Stijn De Weirdt
We've been doing a lot of work on CephFS over the past few months. This is an update on the current state of things as of Giant. ... * Either the kernel client (kernel 3.17 or later) or userspace (ceph-fuse or libcephfs) clients are in good working order. Thanks for all the work and speci

Re: [ceph-users] use ZFS for OSDs

2014-10-29 Thread Stijn De Weirdt
hi michal, thanks for the info. we will certainly try it and see if we come to the same conclusions ;) one small detail: since you were using centos7, i'm assuming you were using ZoL 0.6.3? stijn On 10/29/2014 08:03 PM, Michal Kozanecki wrote: Forgot to mention, when you create the ZFS/Z

Re: [ceph-users] Redundant Power Supplies

2014-10-30 Thread Stijn De Weirdt
if you don't have 2 powerfeeds, don't spend the money. if you have 2 feeds, well, start with 2 PSUs for your switches ;) if you stick with one PSU for the OSDs, make sure you have your cabling (power and network, don't forget your network switches should be on same power feeds ;) and crushmap r

Re: [ceph-users] Test 6

2014-12-15 Thread Leen de Braal
> Lindsay > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -- L. de Braal BraHa Systems NL - Terneuzen T +31 115 649333 ___ ceph-users mailing list ceph-users@lists.ceph

Re: [ceph-users] Dell H310

2014-03-07 Thread Stijn De Weirdt
we tried this with a Dell H200 (also LSI2008 based). however, running some basic benchmarks, we saw no immediate difference between IT and IR firmware. so i'd like to know: what kind of performance improvement do you get, and how did you measure it? thanks a lot stijn the howto for de

Re: [ceph-users] Dell H310

2014-03-07 Thread Stijn De Weirdt
we tried this with a Dell H200 (also LSI2008 based). however, running some basic benchmarks, we saw no immediate difference between IT and IR firmware. so i'd like to know: what kind of performance improvement do you get, and how did you measure it? IMO, flashing to IT firmware is mainly don

Re: [ceph-users] clock skew

2014-03-13 Thread Stijn De Weirdt
can we retest the clock skew condition? or get the value of the skew? ceph status gives health HEALTH_WARN clock skew detected on mon.ceph003 in a polysh session (i.e. a parallel ssh sort of thing) ready (3)> date +%s.%N ceph002 : 1394713567.184218678 ceph003 : 1394713567.182722045 ceph001 : 1
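
ceph health detail does print the measured skew per monitor (e.g. "clock skew 0.19s > max 0.05s"); the warning threshold is the mon_clock_drift_allowed option, 0.05 s by default. A minimal sketch for rechecking and, if really needed, loosening the threshold (NTP/chrony remains the real fix; the 0.1 value is only illustrative):

    ceph health detail                     # shows the measured skew value for each mon

    [mon]
        mon clock drift allowed = 0.1      # warning threshold in seconds (default 0.05)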

Re: [ceph-users] ceph 0.78 mon and mds crashing (bus error)

2014-04-02 Thread Stijn De Weirdt
hi gregory, (i'm a colleague of kenneth) 1) How big and what shape the filesystem is. Do you have some extremely large directory that the MDS keeps trying to load and then dump? anyway to extract this from the mds without having to start it? as it was an rsync operation, i can try to locate po

Re: [ceph-users] ceph 0.78 mon and mds crashing (bus error)

2014-04-02 Thread Stijn De Weirdt
hi, 1) How big and what shape the filesystem is. Do you have some extremely large directory that the MDS keeps trying to load and then dump? anyway to extract this from the mds without having to start it? as it was an rsync operation, i can try to locate possible candidates on the source filesy

Re: [ceph-users] ceph 0.78 mon and mds crashing (bus error)

2014-04-02 Thread Stijn De Weirdt
#42 @ http://inktank.com | http://ceph.com On Wed, Apr 2, 2014 at 9:23 AM, Stijn De Weirdt wrote: hi, 1) How big and what shape the filesystem is. Do you have some extremely large directory that the MDS keeps trying to load and then dump? anyway to extract this from the mds without having to st

Re: [ceph-users] ceph 0.78 mon and mds crashing (bus error)

2014-04-02 Thread Stijn De Weirdt
in the tree? thanks, stijn On 04/02/2014 08:58 PM, Stijn De Weirdt wrote: wow. kudos for integrating this in ceph. more projects should do it like that! anyway, in attachement a gzipped ps file. heap is at 4.4GB, top reports 6.5GB mem usage. care to point out what to look for? i'll s

Re: [ceph-users] ceph 0.78 mon and mds crashing (bus error)

2014-04-02 Thread Stijn De Weirdt
2:58 AM, Stijn De Weirdt wrote: wow. kudos for integrating this in ceph. more projects should do it like that! anyway, in attachement a gzipped ps file. heap is at 4.4GB, top reports 6.5GB mem usage. care to point out what to look for? i'll send a new one when the usage is starting to cau

Re: [ceph-users] ceph 0.78 mon and mds crashing (bus error)

2014-04-04 Thread Stijn De Weirdt
hi yan, (taking the list in CC) On 04/04/2014 04:44 PM, Yan, Zheng wrote: On Thu, Apr 3, 2014 at 2:52 PM, Stijn De Weirdt wrote: hi, latest pprof output attached. this is no kernel client, this is ceph-fuse on EL6. starting the mds without any ceph-fuse mounts works without issue. mounting

Re: [ceph-users] ceph 0.78 mon and mds crashing (bus error)

2014-04-15 Thread Stijn De Weirdt
solve this issue. ok, we'll rebuild and try asap stijn Regards Yan, Zheng Thanks! Regards Yan, Zheng Thanks! Kenneth - Message from Stijn De Weirdt - Date: Fri, 04 Apr 2014 20:31:34 +0200 From: Stijn De Weirdt Subject: Re: [ceph-users] ceph 0.78 mon and md

Re: [ceph-users] [sepia] debian jessie repository ?

2015-09-05 Thread Jelle de Jong
ian.org/jessie-backports/ceph Kind regards, Jelle de Jong ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] How to use cgroup to bind ceph-osd to a specific cpu core?

2015-09-10 Thread Jelle de Jong
Hello Jan, I want to test your pincpus I got from github. I have a 2x CPU (X5550) with 4 core 16 threads system I have four OSD (4x WD1003FBYX) with SSD (SHFS37A) journal . I got three nodes like that. I am not sure how to configure prz-pincpus.conf # prz-pincpus.conf https://paste.debian.net/pl
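
Independent of the pincpus script (whose config format is not shown here), a generic cpuset-cgroup sketch using cgroup-tools that binds one OSD to a few cores; the group name, core list, NUMA node and pid lookup are assumptions for illustration:

    cgcreate -g cpuset:/osd0
    cgset -r cpuset.cpus=0-3 osd0                       # cores reserved for this OSD
    cgset -r cpuset.mems=0 osd0                         # keep memory on the matching NUMA node
    cgclassify -g cpuset:/osd0 $(pidof -s ceph-osd)     # assumes a single ceph-osd process; otherwise pick the pid of osd.0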

[ceph-users] multi-datacenter crush map

2015-09-18 Thread Wouter De Borger
Hi all, I have found on the mailing list that it should be possible to have a multi datacenter setup, if latency is low enough. I would like to set this up, so that each datacenter has at least two replicas and each PG has a replication level of 3. In this
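
A sketch of the kind of CRUSH rule usually suggested for this layout, assuming datacenter buckets already exist under the default root; with pool size 4 it places two copies in each of two datacenters, while with size 3 the firstn logic yields a 2+1 split:

    rule replicated_two_dc {
            ruleset 1
            type replicated
            min_size 2
            max_size 4
            step take default
            step choose firstn 2 type datacenter
            step chooseleaf firstn 2 type host
            step emit
    }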

Re: [ceph-users] multi-datacenter crush map

2015-09-19 Thread Wouter De Borger
at am I missing. Wouter On Fri, Sep 18, 2015 at 10:10 PM, Gregory Farnum wrote: > On Fri, Sep 18, 2015 at 4:57 AM, Wouter De Borger > wrote: > > Hi all, > > > > I have found on the mailing list that it should be possible to have a > multi > > datacenter setup, i

Re: [ceph-users] multi-datacenter crush map

2015-09-21 Thread Wouter De Borger
Jg996GUP3gUl > OEoA > =PfhN > -END PGP SIGNATURE- > > Robert LeBlanc > PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 > > > On Sat, Sep 19, 2015 at 12:54 PM, Wouter De Borger > wrote: > > Ok, so if I understa

[ceph-users] chain replication scheme

2015-09-30 Thread Wouter De Borger
Hi all, In the original paper (RADOS: a scalable, reliable storage service for petabyte-scale storage clusters), three replication schemes were described (primary copy, chain and splay). Now the documentation only discusses primary copy. Does the chain scheme still exist? It would be much more ba

Re: [ceph-users] CephFS and page cache

2015-10-19 Thread Stijn De Weirdt
>>> So: the key thing to realise is that caching behaviour is full of >>> tradeoffs, and this is really something that needs to be tunable, so >>> that it can be adapted to the differing needs of different workloads. >>> Having an optional "hold onto caps for N seconds after file close" >>> sounds

Re: [ceph-users] Problem with infernalis el7 package

2015-11-11 Thread Stijn De Weirdt
did you recreate new rpms with same version/release? it would be better to make new rpms with different release (e.g. 9.2.0-1). we have snapshotted mirrors and nginx caches between ceph yum repo and the nodes that install the rpms, so cleaning the cache locally will not help. stijn On 11/11/2015

[ceph-users] Ceph OSD performance issue

2016-05-18 Thread Davie De Smet
see if the configured max op threads is currently being reached? Or is there any other bottleneck that I am overlooking? Any clear view on this would be appreciated. Kind regards, Davie De Smet Davie De Smet Director Technical Operations and Customer Services, Nomadesk davie.des...@noma

[ceph-users] Bucket resharding: "radosgw-admin bi list" ERROR

2017-07-04 Thread Maarten De Quick
Hi, Background: We're having issues with our index pool (slow requests / timeouts cause crashing of an OSD and a recovery -> application issues). We know we have very big buckets (e.g. a bucket of 77 million objects with only 16 shards) that need a reshard so we were looking at the resharding proce
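
A hedged sketch of the offline resharding commands involved; the bucket name and shard count are placeholders, the reshard command needs a Ceph version that ships it (recent Jewel or later), and writes to the bucket should be stopped while it runs:

    radosgw-admin bucket stats --bucket=bigbucket      # current num_shards and object count
    radosgw-admin bucket reshard --bucket=bigbucket --num-shards=128
    radosgw-admin bi list --bucket=bigbucket           # the index listing that was erroring in this thread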

Re: [ceph-users] Bucket resharding: "radosgw-admin bi list" ERROR

2017-07-05 Thread Maarten De Quick
d instance id. > Note that I didn't care about the data in the bucket, I just wanted to > reshard the index so I could delete the bucket without my radosgw and > osds crashing due to out of memory issues. > > Regards, > Andreas > > On 4 July 2017 at 20:46, Maarten De Quick

Re: [ceph-users] Bucket resharding: "radosgw-admin bi list" ERROR

2017-07-05 Thread Maarten De Quick
df2f0 2017-07-05 08:50:19.898820 7ff4250219c0 1 -- 10.21.4.1:0/3313807338 mark_down 0x7ff4272e1d40 -- 0x7ff4272e2da0 2017-07-05 08:50:19.898938 7ff4250219c0 1 -- 10.21.4.1:0/3313807338 mark_down 0x7ff4272d8160 -- 0x7ff4272d6160

[ceph-users] luminous/bluestore osd memory requirements

2017-08-10 Thread Stijn De Weirdt
hi all, we are planning to purchase new OSD hardware, and we are wondering if for upcoming luminous with bluestore OSDs, anything wrt the hardware recommendations from http://docs.ceph.com/docs/master/start/hardware-recommendations/ will be different, esp the memory/cpu part. i understand from coll

Re: [ceph-users] luminous/bluestore osd memory requirements

2017-08-12 Thread Stijn De Weirdt
rabyte > recommendation will be more realistic than it was in the past (an > effective increase in memory needs), but also that it will be under much > better control than previously. > > On Thu, Aug 10, 2017 at 1:35 AM Stijn De Weirdt > wrote: > >> hi all, >&

Re: [ceph-users] luminous/bluestore osd memory requirements

2017-08-12 Thread Stijn De Weirdt
endation of how fast your processor should > be... But making it based on how much GHz per TB is an invitation to > context switch to death. > > On Sat, Aug 12, 2017, 8:40 AM Stijn De Weirdt > wrote: > >> hi all, >> >> thanks for all the feedback. it&

[ceph-users] Fwd: Can't get fullpartition space

2017-08-17 Thread Maiko de Andrade
p, 1 in data: pools: 0 pools, 0 pgs objects: 0 objects, 0 bytes usage: 1054 MB used, 9185 MB / 10240 MB avail pgs: []´s Maiko de Andrade MAX Brasil Desenvolvedor de Sistemas +55 51 91251756 http://about.me/maiko _

Re: [ceph-users] Fwd: Can't get fullpartition space

2017-08-18 Thread Maiko de Andrade
ctivate.monmap', '--osd-data', '/var/lib/ceph/tmp/mnt.VA1j0C', '--osd-uuid', u'f48afefb-e146-4fdd-8e8f-1d7dba08ec75', '--setuser', 'ceph', '--setgroup', 'ceph']' returned non-zero exit status -6 [ceph][ERR

[ceph-users] ceph-fuse hanging on df with ceph luminous >= 12.1.3

2017-08-21 Thread Alessandro De Salvo
Hi, when trying to use df on a ceph-fuse mounted cephfs filesystem with ceph luminous >= 12.1.3 I'm having hangs with the following kind of messages in the logs: 2017-08-22 02:20:51.094704 7f80addb7700 0 client.174216 ms_handle_reset on 192.168.0.10:6789/0 The logs are only showing this

[ceph-users] ceph inconsistent pg missing ec object

2017-10-18 Thread Stijn De Weirdt
hi all, we have a ceph 10.2.7 cluster with a 8+3 EC pool. in that pool, there is a pg in inconsistent state. we followed http://ceph.com/geen-categorie/ceph-manually-repair-object/, however, we are unable to solve our issue. from the primary osd logs, the reported pg had a missing object. we fo
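
For scrub-detected inconsistencies the usual first steps are below, with a hypothetical PG id; as noted later in the thread, EC pools carry checksums, so repair can normally identify the bad shard without manual object surgery:

    rados list-inconsistent-obj 7.1ab --format=json-pretty   # which object/shards the scrub flagged
    ceph pg repair 7.1ab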

Re: [ceph-users] ceph inconsistent pg missing ec object

2017-10-19 Thread Stijn De Weirdt
any other relevant data. You shouldn't need to do manual repair of > erasure-coded pools, since it has checksums and can tell which bits are > bad. Following that article may not have done you any good (though I > wouldn't expect it to hurt, either...)... > -Greg > > On

[ceph-users] PG's stuck unclean active+remapped

2017-10-19 Thread Roel de Rooy
n a rapid amount, we started losing osd's, one by one, and rebalance/recovery started kicking in. As connectivity between the monitor servers appeared ok (ping connectivity was somehow still there, there was still a quorum visible and ceph commands worked on all three), we didn't suspect the m

Re: [ceph-users] ceph inconsistent pg missing ec object

2017-10-20 Thread Stijn De Weirdt
om the others. Plus different log sizes? It's not making a ton of sense > at first glance. > -Greg > > On Thu, Oct 19, 2017 at 1:08 AM Stijn De Weirdt > wrote: > >> hi greg, >> >> i attached the gzip output of the query and some more info below. if you >&

[ceph-users] Problem activating osd's

2017-11-16 Thread de Witt, Shaun
Hi I have searched the threads for a resolution to this problem, but so far have had no success. First – my setup. I am trying to replicate the setup on the quick ceph-deploy pages. I have 4 virtual machines (virtualbox running SL7.3 – a CentOS clone). Iptables is not running on any nodes.

[ceph-users] how to replace journal ssd in one node ceph-deploy setup

2017-11-23 Thread Jelle de Jong
] /dev/sdg6 ceph journal, for /dev/sda1 [ceph04][DEBUG ] /dev/sdg7 ceph journal, for /dev/sdd1 Thank you in advance, Kind regards, Jelle de Jong ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
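
A sketch of the usual filestore journal swap for a ceph-disk/ceph-deploy style setup, with a hypothetical OSD id and a systemd host assumed; the key point is flushing the journal before removing the old SSD and recreating it (and pointing the journal symlink / journal_uuid at the new partition) before restarting:

    ceph osd set noout
    systemctl stop ceph-osd@1          # hypothetical id; repeat per OSD using the failing SSD
    ceph-osd -i 1 --flush-journal
    # swap the SSD, recreate the journal partition, fix the /var/lib/ceph/osd/ceph-1/journal symlink
    ceph-osd -i 1 --mkjournal
    systemctl start ceph-osd@1
    ceph osd unset noout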

Re: [ceph-users] Linux Meltdown (KPTI) fix and how it affects performance?

2018-01-05 Thread Stijn De Weirdt
- will affect >> librbd performance in the hypervisors. >> >> Does anybody have some information about how Meltdown or Spectre affect ceph >> OSDs and clients? >> >> Also, regarding Meltdown patch, seems to be a compilation option, meaning >> you coul

[ceph-users] cephfs degraded on ceph luminous 12.2.2

2018-01-08 Thread Alessandro De Salvo
Hi, I'm running on ceph luminous 12.2.2 and my cephfs suddenly degraded. I have 2 active mds instances and 1 standby. All the active instances are now in replay state and show the same error in the logs: mds1 2018-01-08 16:04:15.765637 7fc2e92451c0  0 ceph version 12.2.2 (cf0baee

Re: [ceph-users] cephfs degraded on ceph luminous 12.2.2

2018-01-08 Thread Alessandro De Salvo
Mon, 2018-01-08 at 17:21 +0100, Alessandro De Salvo wrote: Hi, I'm running on ceph luminous 12.2.2 and my cephfs suddenly degraded. I have 2 active mds instances and 1 standby. All the active instances are now in replay state and show the same error in the logs: mds1 2018-01-08 1

Re: [ceph-users] cephfs degraded on ceph luminous 12.2.2

2018-01-11 Thread Alessandro De Salvo
On 01/08/2018 05:40 PM, Alessandro De Salvo wrote: > > Thanks Lincoln, > > > > indeed, as I said the cluster is recovering, so there are pending ops: > > > > > > pgs: 21.034% pgs not active > > 1692310/24980804 objects degraded (6.7

[ceph-users] Luminous 12.2.2 OSDs with Bluestore crashing randomly

2018-01-30 Thread Alessandro De Salvo
Hi, we have several times a day different OSDs running Luminous 12.2.2 and Bluestore crashing with errors like this: starting osd.2 at - osd_data /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal 2018-01-30 13:45:28.440883 7f1e193cbd00 -1 osd.2 107082 log_to_monitors {default=true}

Re: [ceph-users] Luminous 12.2.2 OSDs with Bluestore crashing randomly

2018-01-31 Thread Alessandro De Salvo
itto: On Tue, Jan 30, 2018 at 5:49 AM Alessandro De Salvo <alessandro.desa...@roma1.infn.it> wrote: Hi, we have several times a day different OSDs running Luminous 12.2.2 and Bluestore crashing with errors like this: starting osd.2 at - osd_data /var/lib/ceph

Re: [ceph-users] Ceph Bluestore performance question

2018-02-18 Thread Stijn De Weirdt
hi oliver, the IPoIB network is not 56gb, it's probably a lot less (20gb or so). the ib_write_bw test is verbs/rdma based. do you have iperf tests between hosts, and if so, can you share those results? stijn > we are just getting started with our first Ceph cluster (Luminous 12.2.2) and > doing

Re: [ceph-users] CephFS: No space left on device

2016-10-10 Thread Davie De Smet
d5-freu mds 'allow' mon 'allow rwx' osd 'allow rwx' ” · Tweaked the default settings up for mds_max_purge_files and mds_max_purge_ops_per_pg. Unfortunately all of this did not help. The MDS strays are still at 2.8M and growing. We do use hardlinks on our system.

Re: [ceph-users] CephFS: No space left on device

2016-10-11 Thread Davie De Smet
Or do you mean that I'm required to do a small touch/write on all files that have not yet been deleted (this would be painful as the cluster is 200TB+)? Kind regards, Davie De Smet -Original Message- From: Gregory Farnum [mailto:gfar...@redhat.com] Sent: Monday, October 10, 2016

Re: [ceph-users] CephFS: No space left on device

2016-10-12 Thread Davie De Smet
a heads up. Kind regards, Davie De Smet Director Technical Operations and Customer Services, Nomadesk +32 9 240 10 31 (Office) -Original Message- From: Gregory Farnum [mailto:gfar...@redhat.com] Sent: Wednesday, October 12, 2016 2:11 AM To: Davie De Smet Cc: Mykola Dvornik ; John Spr

Re: [ceph-users] CephFS: No space left on device

2016-10-12 Thread Davie De Smet
Hi, That sounds great. I'll certainly try it out. Kind regards, Davie De Smet -Original Message- From: Yan, Zheng [mailto:uker...@gmail.com] Sent: Wednesday, October 12, 2016 3:41 PM To: Davie De Smet Cc: Gregory Farnum ; ceph-users Subject: Re: [ceph-users] CephFS: No space

[ceph-users] 10Gbit switch advice for small ceph cluster upgrade

2016-10-27 Thread Jelle de Jong
X520-SR1 Kind regards, Jelle de Jong GNU/Linux Consultant ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

[ceph-users] Looking for some advise on distributed FS: Is Ceph the right option for me?

2018-07-10 Thread Jones de Andrade
Hi all. I'm looking for some information on several distributed filesystems for our application. It looks like it finally came down to two candidates, Ceph being one of them. But there are still a few questions about it that I would really like to clarify, if possible. Our plan, initially on 6 w

[ceph-users] MDS damaged

2018-07-11 Thread Alessandro De Salvo
Hi, after the upgrade to luminous 12.2.6 today, all our MDSes have been marked as damaged. Trying to restart the instances only results in standby MDSes. We currently have 2 filesystems active and 2 MDSes each. I found the following error messages in the mon: mds.0 :6800/2412911269 down:dama
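
A cautious sketch of the inspection/recovery commands that usually come up in this situation (the filesystem name is a placeholder); note that "ceph mds repaired" only clears the damaged flag for a rank, it does not repair whatever corrupted the metadata objects:

    ceph fs get cephfs_name | grep damaged     # which ranks are flagged as damaged
    cephfs-journal-tool journal inspect        # journal integrity check (defaults to the first filesystem, rank 0)
    ceph mds repaired cephfs_name:0            # clear the damaged flag for rank 0 once the cause is understood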

Re: [ceph-users] MDS damaged

2018-07-11 Thread Alessandro De Salvo
e damage before issuing the "repaired" command? What is the history of the filesystems on this cluster? On Wed, Jul 11, 2018 at 8:10 AM Alessandro De Salvo <alessandro.desa...@roma1.infn.it> wrote: Hi, after the upgrade to luminous 12.2.6 today, all our MDS

Re: [ceph-users] MDS damaged

2018-07-11 Thread Alessandro De Salvo
, 2018 at 4:10 PM Alessandro De Salvo wrote: Hi, after the upgrade to luminous 12.2.6 today, all our MDSes have been marked as damaged. Trying to restart the instances only result in standby MDSes. We currently have 2 filesystems active and 2 MDSes each. I found the following error messages in the

Re: [ceph-users] MDS damaged

2018-07-11 Thread Alessandro De Salvo
controllers, but 2 of the OSDs with 10.14 are on a SAN system and one on a different one, so I would tend to exclude they both had (silent) errors at the same time. Thanks,     Alessandro Il 11/07/18 18:56, John Spray ha scritto: On Wed, Jul 11, 2018 at 4:49 PM Alessandro De Salvo wrote:

Re: [ceph-users] MDS damaged

2018-07-12 Thread Alessandro De Salvo
> Il giorno 11 lug 2018, alle ore 23:25, Gregory Farnum ha > scritto: > >> On Wed, Jul 11, 2018 at 9:23 AM Alessandro De Salvo >> wrote: >> OK, I found where the object is: >> >> >> ceph osd map cephfs_metadata 200. >>
