Hi,
I rebooted a failed server, which is now showing a rogue filesystem mount.
Actually, several disks were also missing from the node, all reported as
"prepared" by ceph-disk, but not activated.
[root@ceph2 ~]# grep /var/lib/ceph/tmp /etc/mtab
/dev/sdo1 /var/lib/ceph/tmp/mnt.usVRe8 xfs rw,n
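For disks stuck in the "prepared" state, what I would try by hand (a sketch;
/dev/sdo1 and the tmp mountpoint are the ones from the example above):
# ceph-disk list                        # confirm the partition shows as "prepared"
# umount /var/lib/ceph/tmp/mnt.usVRe8   # clear the leftover temporary mount
# ceph-disk activate /dev/sdo1          # activate the prepared data partition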
Hi,
I am attempting to test the cephfs filesystem layouts.
I created a user with rights to write to only one pool:
client.puppet
key:zzz
caps: [mon] allow r
caps: [osd] allow rwx pool=puppet
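For reference, a user with exactly those caps can be created (or fetched) in one
command; this is just the generic ceph CLI form, using the names shown above:
ceph auth get-or-create client.puppet mon 'allow r' osd 'allow rwx pool=puppet'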
I also created another pool in which I would assume this user is allowed to do
-users@lists.ceph.com
Subject: Re: [ceph-users] cephfs filesystem layouts : authentication gotchas ?
On 03/03/2015 15:21, SCHAER Frederic wrote:
>
> By the way : looks like the "ceph fs ls" command is inconsistent when
> the cephfs is mounted (I used a locally compiled kmod-c
Hi,
I've seen and read a few things about ceph-crush-location and I think that's
what I need.
What I need (want to try) is : a way to have SSDs in non-dedicated hosts, but
also to put those SSDs in a dedicated ceph root.
From what I read, using ceph-crush-location, I could add a hostname with
-Original Message-
(...)
> So I just have to associate the mountpoint with the device... provided OSD is
> mounted when the tool is called.
> Anyone willing to share experience with ceph-crush-location ?
>
Something like this? https://gist.github.com/wido/5d26d88366e28e25e23d
I've us
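For what it's worth, here is the shape of a crush location hook as I understand
it (an untested sketch: the script path, the "ssd" root and the "-ssd" host
suffix are my own assumptions, not something taken from this thread). The hook
is declared in ceph.conf and must print the OSD's CRUSH location as key=value
pairs on stdout:

# ceph.conf
[osd]
osd crush location hook = /usr/local/bin/ceph-crush-location-ssd

# /usr/local/bin/ceph-crush-location-ssd
#!/bin/sh
# Hypothetical hook: put SSD-backed OSDs under a dedicated "ssd" root.
# Ceph calls it with: --cluster <name> --id <osd id> --type osd
while [ $# -gt 0 ]; do
    case "$1" in
        --id) ID="$2"; shift ;;
    esac
    shift
done
# Find the partition backing this OSD's data dir, then check the disk's rotational flag.
PART=$(findmnt -n -o SOURCE "/var/lib/ceph/osd/ceph-${ID}")
DISK=$(basename "${PART}" | sed 's/[0-9]*$//')
if [ "$(cat /sys/block/${DISK}/queue/rotational 2>/dev/null)" = "0" ]; then
    echo "root=ssd host=$(hostname -s)-ssd"
else
    echo "root=default host=$(hostname -s)"
fi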
Hi again,
On my testbed, I have 5 ceph nodes, each containing 23 OSDs (2TB btrfs drives).
For these tests, I've set up a RAID0 on the 23 disks.
For now, I'm not using SSDs, as I discovered my vendor apparently decreased
their performance on purpose...
So: 5 server nodes, of which 3 are also MONs.
I also
;)
Regards
From: Nick Fisk [mailto:n...@fisk.me.uk]
Sent: Thursday, 23 April 2015 17:21
To: SCHAER Frederic; ceph-users@lists.ceph.com
Subject: RE: read performance VS network usage
Hi Frederic,
If you are using EC pools, the primary OSD requests the remaining shards of the
object from the other
They are also receiving much more data than what rados bench reports (around
275MB/s each)... would that be some sort of data amplification ??
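For a rough sanity check (numbers purely illustrative; the EC profile isn't
visible in this excerpt): with k=4, m=1, a primary OSD holds one shard of an
object and has to fetch the other k-1 = 3 shards over the cluster network
before it can reconstruct the object and send it to the client. So for every
1000 MB/s that rados bench reports, roughly 750 MB/s of additional shard
traffic lands on the OSD nodes' NICs, which by itself puts their receive
counters well above the client-visible figure.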
Regards
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On behalf of SCHAER
Frederic
Sent: Friday, 24 April 2015 10:03
To: Nick Fisk
And to reply to myself...
The apparent client network bandwidth is just dstat aggregating the bridge
network interface and the physical interface, thus doubling the data...
Ah ah ah.
Regards
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On behalf of SCHAER
Frederic
Hi,
As I explained in various previous threads, I'm having a hard time getting the
most out of my test ceph cluster.
I'm benching things with rados bench.
All Ceph hosts are on the same 10Gb/s switch.
Basically, I know I can get about 1GB/s of disk write performance per host,
when I bench things
I gave you more insights on what I’m trying to achieve, and where I’m
failing ?
Regards
-Original Message-
From: Gregory Farnum [mailto:g...@gregs42.com]
Sent: Wednesday, 22 July 2015 16:01
To: Florent MONTHEL
Cc: SCHAER Frederic; ceph-users@lists.ceph.com
Subject: Re: [ceph
Hi,
Well, I think the journaling would still appear in the dstat output, as that's
still I/O: even if the user-side bandwidth is indeed cut in half, that should
not be the case for disk I/O.
For instance, I just tried a replicated pool for the test, and got around
1300MiB/s in dstat for about 600
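Back-of-the-envelope (assuming size=2 and colocated filestore journals, purely
for illustration): each client byte is forwarded once to the replica and
written twice on every OSD that stores it (journal, then data store).
Aggregate disk writes across the cluster are therefore roughly client
bandwidth x 2 (replicas) x 2 (journal + store) = 4x, and inter-node network
traffic adds about 1x for the replica copies, so dstat totals sitting well
above the rados bench number are expected.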
Sent: Thursday, 23 July 2015 14:18
To: ceph-users@lists.ceph.com
Cc: Gregory Farnum; SCHAER Frederic
Subject: Re: [ceph-users] Ceph 0.94 (and lower) performance on >1 hosts ??
On Thu, 23 Jul 2015 11:14:22 +0100 Gregory Farnum wrote:
> Your note that dd can do 2GB/s without networking mak
ideas/ ... :'(
Regards
-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On behalf of SCHAER
Frederic
Sent: Friday, 24 July 2015 16:04
To: Christian Balzer; ceph-users@lists.ceph.com
Subject: [PROVENANCE INTERNET] Re: [ceph-users] Ceph 0.94 (an
From: Jake Young [mailto:jak3...@gmail.com]
Sent: Wednesday, 29 July 2015 17:13
To: SCHAER Frederic
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Ceph 0.94 (and lower) performance on >1 hosts ??
On Tue, Jul 28, 2015 at 11:48 AM, SCHAER Frederic
mailto:frederic.sch...@cea
Hi,
I am setting up a test ceph cluster on decommissioned hardware (hence: not
optimal, I know).
I have installed CentOS7, installed and set up ceph mons and OSD machines using
puppet, and now I'm trying to add OSDs with the servers' OSD disks... and I have
issues (of course ;) )
I used the Ce
'/sbin/blkid' returned non-zero exit status 2
+ exit
+ exec
regards
Frederic.
P.S.: in your puppet module, it seems impossible to specify OSD disks by path,
i.e.:
ceph::profile::params::osds:
'/dev/disk/by-path/pci-\:0a\:00.0-scsi-0\:2\:':
(I tried without the backslashe
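For what it's worth, what I would try on the hiera side (an untested sketch;
the by-path name below is a made-up example, not my real controller path, and
whether the module resolves by-path devices at all is exactly the open
question): quoting the whole key should avoid having to backslash-escape the
colons in YAML.
ceph::profile::params::osds:
  '/dev/disk/by-path/pci-0000:0a:00.0-scsi-0:2:0:0':
    journal: '/dev/sdq1'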
Partition size: 10483713 sectors (5.0 GiB)
Attribute flags:
Partition name: 'ceph journal'
Puzzling, isn't it ?
-Original Message-
From: Loic Dachary [mailto:l...@dachary.org]
Sent: Thursday, 9 October 2014 15:37
To: SCHAER Frederic; ceph-users@lists.ce
-Original Message-
From: Loic Dachary [mailto:l...@dachary.org]
Sent: Thursday, 9 October 2014 16:20
To: SCHAER Frederic; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] ceph-disk prepare :
UUID=00000000-0000-0000-0000-000000000000
On 09/10/2014 16:04, SCHAER Frederic wrote:
-Original Message-
From: Loic Dachary [mailto:l...@dachary.org]
The failure
journal check: ondisk fsid 00000000-0000-0000-0000-000000000000 doesn't match
expected 244973de-7472-421c-bb25-4b09d3f8d441
and the udev logs
DEBUG:ceph-disk:Journal /dev/sdc2 has OSD UUID
--
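For reference, a way to inspect the journal partition GUIDs, plus the
(destructive) reset I'd fall back to when a disk ends up in this state (a
sketch, not necessarily the fix for this particular bug; device names as
above):
# sgdisk --info=2 /dev/sdc     # show the type and unique GUIDs of the 'ceph journal' partition
# ceph-disk zap /dev/sdc       # destructive: wipe the partition table
# ceph-disk prepare /dev/sdc   # re-create the data and journal partitions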
that once every 20
(?) times it "surprisingly does not fail" on this hardware/os combination ;)
Regards
-Original Message-
From: Loic Dachary [mailto:l...@dachary.org]
Sent: Friday, 10 October 2014 14:37
To: SCHAER Frederic; ceph-users@lists.ceph.com
Subject: Re: [ceph-users]
Hi Loic,
Back on this issue...
Using the EPEL package, I still get "prepared-only" disks, e.g.:
/dev/sdc :
/dev/sdc1 ceph data, prepared, cluster ceph, journal /dev/sdc2
/dev/sdc2 ceph journal, for /dev/sdc1
Looking at udev output, I can see that there is no "ACTION=add" with
ID_PART_ENTRY_
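A way to watch what udev actually emits for these partitions, plus a manual
fallback when the add event never comes (a sketch; device names as in the
example above):
# udevadm monitor --property --udev &    # watch events and their properties live
# udevadm trigger -c add -s block        # re-fire "add" events for all block devices
# ceph-disk activate /dev/sdc1           # manual activation if the rules still don't run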
Hi,
I'm used to RAID software giving me the failing disks' slots, and most often
blinking the disks in the disk bays.
I recently installed a DELL "6GB HBA SAS" JBOD card, said to be an LSI 2008
one, and I now have to identify 3 pre-failed disks (so says S.M.A.R.T.).
Since this is an LSI, I tho
. If you match along the lines of
>>
>> KERNEL=="sd*[a-z]", KERNELS=="end_device-*:*:*"
>>
>> then you'll just have to cat "/sys/class/sas_device/${1}/bay_identifier"
>> in a script (with $1 being the $id of udev after that match, so
do that)
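Along those lines, a minimal sketch (untested here; it assumes an
LSI/mpt2sas-style sysfs layout where the disk's device path goes through an
end_device-H:B:T node):
#!/bin/sh
# Usage: bay.sh sdc   -> prints the SAS bay identifier of /dev/sdc
dev=${1:?usage: $0 sdX}
# Resolve the disk's sysfs device path and extract its end_device component.
syspath=$(readlink -f "/sys/block/${dev}/device")
end_dev=$(echo "${syspath}" | grep -o 'end_device-[0-9:]*' | tail -1)
if [ -n "${end_dev}" ]; then
    cat "/sys/class/sas_device/${end_dev}/bay_identifier"
else
    echo "no SAS end_device found for ${dev}" >&2
    exit 1
fi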
Regards
From: Craig Lewis [mailto:cle...@centraldesktop.com]
Sent: Monday, 17 November 2014 22:32
To: SCHAER Frederic
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] jbod + SMART : how to identify failing disks ?
I use `dd` to force activity to the disk I want to replace, and
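(A read-only way to generate that activity, assuming /dev/sdX is the suspect
disk; iflag=direct bypasses the page cache so the reads really hit the
platters:)
# dd if=/dev/sdX of=/dev/null bs=1M iflag=direct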
Clearly this cannot be put in production as-is and I'll have to find a way.
Regards
-Original Message-
From: Carl-Johan Schenström [mailto:carl-johan.schenst...@gu.se]
Sent: Monday, 17 November 2014 14:14
To: SCHAER Frederic; Scottix; Erik Logtenberg
Cc: ceph-users@lists.
Hi,
I rebooted a node (I'm doing some tests, and breaking many things ;)), and I
see I have:
[root@ceph0 ~]# mount|grep sdp1
/dev/sdp1 on /var/lib/ceph/tmp/mnt.eml1yz type xfs
(rw,noatime,attr2,inode64,noquota)
/dev/sdp1 on /var/lib/ceph/osd/ceph-55 type xfs
(rw,noatime,attr2,inode64,noquota)
[
ois.lefilla...@uni.lu]
Sent: Wednesday, 19 November 2014 13:42
To: SCHAER Frederic
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] jbod + SMART : how to identify failing disks ?
Hello again,
So whatever magic allows the Dell MD1200 to report the slot position for
each disk isn't prese
Hi,
Forgive the question if the answer is obvious... It's been more than "an hour
or so" and eu.ceph.com apparently still hasn't been re-signed or at least what
I checked wasn't :
# rpm -qp --qf '%{RSAHEADER:pgpsig}'
http://eu.ceph.com/rpm-hammer/el7/x86_64/ceph-0.94.3-0.el7.centos.x86_64.rpm
15:05, SCHAER Frederic wrote:
> Hi,
>
> Forgive the question if the answer is obvious... It's been more than "an hour
> or so" and eu.ceph.com apparently still hasn't been re-signed or at least
> what I checked wasn't :
>
> # rpm -qp --qf '%{RSAHEA
Hi,
With 5 hosts, I could successfully create pools with k=4 and m=1, with the
failure domain being set to "host".
With 6 hosts, I could also create k=4,m=1 EC pools.
But I suddenly failed with 6 hosts and k=5, m=1, or k=4, m=2: the PGs were never
created - I reused the pool name for my tests, th
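For reference, the kind of commands involved (profile and pool names below are
made up; on hammer the key is ruleset-failure-domain, which later releases
renamed to crush-failure-domain):
ceph osd erasure-code-profile set ec_k5m1 k=5 m=1 ruleset-failure-domain=host
ceph osd pool create ecpool 1024 1024 erasure ec_k5m1
ceph health detail | grep creating    # check whether the PGs stay stuck creating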
Hi,
I'm <sort of> following the upgrade instructions on CentOS 7.2.
I upgraded 3 OSD nodes without too many issues, even if I would rewrite those
upgrade instructions to:
#chrony has ID 167 on my systems... this was set at install time! but I use
NTP anyway.
yum remove chrony
sed -i -e '
I believe this is because I did not read the instructions thoroughly enough...
this is my first "live upgrade".
-Original Message-
From: Oleksandr Natalenko [mailto:oleksa...@natalenko.name]
Sent: Monday, 2 May 2016 16:39
To: SCHAER Frederic; ceph-us...@ceph.com
Ob
I got that sorted out.
I had to re-create the MON data directory on this node, and this seems to have
been enough, even if not trivial.
I'm not sure if it helped that I destroyed cephfs (not yet used) and the MDS
daemons... I was lucky I could do that.
/lesson learned: upgrade MONs first/
Hi,
--
First, let me start with the bonus...
I migrated from hammer => jewel and followed the migration instructions... but
the migration instructions are missing this:
#chown -R ceph:ceph /var/log/ceph
I just discovered this was the reason I found no logs anywhere about my current
issue :/
--
This
I do…
In my case, I have colocated the MONs with some OSDs, and as recently as last
Saturday, when I lost data again, I found out that one of the MON+OSD nodes ran
out of memory and started killing ceph-mon on that node…
At the same moment, all OSDs started to complain about not being able to see
oth
Hi,
Same for me... unsetting the bitwise flag considerably lowered the number of
unfound objects.
I'll have to wait/check for the remaining 214 though...
Cheers
-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On behalf of Samuel
Just
Sent: Thursday, 2 Ju
Hi,
I'm facing the same thing after I reinstalled a node directly in jewel...
Reading: http://thread.gmane.org/gmane.comp.file-systems.ceph.devel/31917
I can confirm that running "udevadm trigger -c add -s block" fires the udev
rules and gets ceph-osd up.
Thing is: I now have reinstalled
Hi,
Every now and then, sectors die on disks.
When this happens on my bluestore (kraken) OSDs, I get 1 PG that becomes
degraded.
The exact status is :
HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
pg 12.127 is active+clean+inconsistent, acting [141,67,85]
If I do a # rados list-inconsistent-
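(For completeness, the two commands I have in mind here, using the PG from the
status above; list-inconsistent-obj only has something to report after a recent
deep scrub:)
# rados list-inconsistent-obj 12.127 --format=json-pretty
# ceph pg repair 12.127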
Hi,
I read the 12.2.7 upgrade notes, and set "osd skip data digest = true" before I
started upgrading from 12.2.6 on my Bluestore-only cluster.
As far as I can tell, my OSDs all got restarted during the upgrade and all got
the option enabled :
This is what I see for a specific OSD taken at rand
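A way to double-check the live value through the admin socket (a sketch; osd.0
is just an arbitrary example id, and the command has to run on the node hosting
that OSD):
# ceph daemon osd.0 config get osd_skip_data_digest
# expected output: { "osd_skip_data_digest": "true" }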
Regards
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On behalf of SCHAER
Frederic
Sent: Tuesday, 24 July 2018 15:01
To: ceph-users
Subject: [PROVENANCE INTERNET] [ceph-users] 12.2.7 + osd skip data digest +
bluestore + I/O errors
Hi,
I read the 12.2.7 upgrade notes, and set "osd ski
m]
Sent: Tuesday, 24 July 2018 16:50
To: SCHAER Frederic
Cc: ceph-users
Subject: Re: [ceph-users] 12.2.7 + osd skip data digest + bluestore + I/O errors
`ceph versions` -- you're sure all the osds are running 12.2.7 ?
osd_skip_data_digest = true is supposed to skip any crc checks du
"stable"}
On the good side: this update is forcing us to dive into ceph internals:
we'll be more ceph-aware tonight than this morning ;)
Cheers
Fred
-Original Message-
From: SCHAER Frederic
Sent: Wednesday, 25 July 2018 09:57
To: 'Dan van der Ster'
n] finish_promote unexpected
promote error (5) Input/output error
And I don't see object rbd_data.1920e2238e1f29.0dfc (:head ?) in
the unflush-able objects...
Cheers
-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On behalf of SCHAER
Fre
Hi,
For those facing (lots of) active+clean+inconsistent PGs after the luminous
12.2.6 metadata corruption and 12.2.7 upgrade, I'd like to explain how I
finally got rid of those.
Disclaimer: my cluster doesn't contain highly valuable data, and I can sort of
recreate what it actually contains
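The overall shape of it (a generic sketch only; the excerpt is truncated right
where the actual procedure starts, so don't read this as exactly what I ran):
pull the PG ids straight out of ceph health detail and walk them one by one.
ceph health detail | awk '/active\+clean\+inconsistent/ {print $2}' | \
while read pg; do
    rados list-inconsistent-obj "$pg" --format=json-pretty   # inspect before touching anything
    ceph pg repair "$pg"
done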
Hi,
I have 5 data nodes (bluestore, kraken), each with 24 OSDs.
I enabled the optimal crush tunables.
I'd like to try to "really" use EC pools, but until now I've faced cluster
lockups when I was using 3+2 EC pools with a host failure domain.
When a host was down for instance ;)
Since I'd like t
Hi,
I just started testing VMs inside ceph this week, ceph-hammer 0.94-5 here.
I built several pools, using pool tiering:
- A small replicated SSD pool (5 SSDs only, but I thought it'd be
better for IOPS, I intend to test the difference with disks only)
- Overlaying a larger
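(For anyone reproducing the setup: a cache tier of that kind is normally wired
up along these lines; "ssd-cache" and "ec-data" are placeholder pool names, not
my actual ones:)
ceph osd tier add ec-data ssd-cache
ceph osd tier cache-mode ssd-cache writeback
ceph osd tier set-overlay ec-data ssd-cache
ceph osd pool set ssd-cache hit_set_type bloom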
Sent: Wednesday, 24 February 2016 19:16
To: SCHAER Frederic
Cc: ceph-us...@ceph.com; HONORE Pierre-Francois
Subject: Re: [ceph-users] ceph hammer : rbd info/Status : operation not
supported (95) (EC+RBD tier pools)
If you run "rados -p ls | grep "rbd_id." and don't see
that obj
Hi,
I'm sure I'm doing something wrong, I hope someone can enlighten me...
I'm encountering many issues when I restart a ceph server (any ceph server).
This is on CentOS 7.2, ceph-0.94.6-0.el7.x86_64.
First: I have disabled abrt. I don't need abrt.
But when I restart, I see these logs in the sys
Hi,
One simple/quick question.
In my ceph cluster, I had a disk which was in predicted failure. It was so much
in predicted failure that the ceph OSD daemon crashed.
After the OSD crashed, ceph moved data correctly (or at least that's what I
thought), and ceph -s was reporting "HEALTH_OK".
Perf