Re: [ceph-users] Consumer-grade SSD in Ceph

2019-12-19 Thread Udo Lembke
Hi, if you add an SSD with a short lifetime on more than one server, you can run into real trouble (data loss)! Even if all other SSDs are enterprise grade. Ceph mixes all data in PGs, which are spread over many disks - if one disk fails - no problem, but if the next two fail after that due to high io (r
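As a side note: a quick, hedged way to keep an eye on SSD wear before several disks reach end-of-life together is the SMART wear attribute (the device name is a placeholder and the attribute name differs per vendor):

    # remaining-life indicator; e.g. Wear_Leveling_Count (Samsung) or
    # Media_Wearout_Indicator (Intel) - names vary per vendor
    smartctl -A /dev/sdX | egrep -i 'wear|wearout'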

Re: [ceph-users] crushmap shows wrong osd for PGs (EC-Pool)

2018-06-30 Thread Udo Lembke
Hi again, On 29.06.2018 17:37, ulem...@polarzone.de wrote: > ... > 24.cc crushmap: [8,111,12,88,128,44,56] > real life:  [8,121, X,88,130,44,56] - due to the new osd-12 and the > wrong search list (osd-121 + osd-130) the PG is undersized! > > /var/lib/ceph/osd/ceph-8/current/24.ccs0_head > /var/li

Re: [ceph-users] Broken Ceph Cluster when adding new one - Proxmox 5.0 & Ceph Luminous

2017-07-16 Thread Udo Lembke
Hi, On 16.07.2017 15:04, Phil Schwarz wrote: > ... > Same result, the OSD is known by the node, but not by the cluster. > ... Firewall? Or a mismatch in /etc/hosts or DNS? Udo

Re: [ceph-users] Broken Ceph Cluster when adding new one - Proxmox 5.0 & Ceph Luminous

2017-07-15 Thread Udo Lembke
Hi, On 15.07.2017 16:01, Phil Schwarz wrote: > Hi, > ... > > While investigating, I wondered about my config : > Question relative to /etc/hosts file : > Should I use private_replication_LAN IPs or public ones ? private_replication_LAN!! And the pve-cluster should use another network (NICs) if poss

Re: [ceph-users] Re-weight Entire Cluster?

2017-05-29 Thread Udo Lembke
Hi Mike, On 30.05.2017 01:49, Mike Cave wrote: > > Greetings All, > > > > I recently started working with our ceph cluster here and have been > reading about weighting. > > > > It appears the current best practice is to weight each OSD according > to its size (3.64 for 4TB drive, 7.45 for 8TB

Re: [ceph-users] How to think a two different disk's technologies architecture

2017-03-23 Thread Udo Lembke
Hi, ceph speeds up with more nodes and more OSDs - so go for 6 nodes with mixed SSD+SATA. Udo On 23.03.2017 18:55, Alejandro Comisario wrote: > Hi everyone! > I have to install a ceph cluster (6 nodes) with two "flavors" of > disks, 3 servers with SSD and 3 servers with SATA. > > Y will purchase

Re: [ceph-users] Upgrading 2K OSDs from Hammer to Jewel. Our experience

2017-03-11 Thread Udo Lembke
Hi, thanks for the useful info. On 11.03.2017 12:21, cephmailingl...@mosibi.nl wrote: > > Hello list, > > A week ago we upgraded our Ceph clusters from Hammer to Jewel and with > this email we want to share our experiences. > > ... > > > e) find /var/lib/ceph/ ! -uid 64045 -print0|xargs -0

Re: [ceph-users] Testing a node by fio - strange results to me

2017-01-22 Thread Udo Lembke
rds > Ahmed > > > On Sun, Jan 22, 2017 at 6:45 PM, Udo Lembke <mailto:ulem...@polarzone.de>> wrote: > > Hi, > > I don't use mds, but I thinks it's the same like with rdb - the readed > data are cached on the OSD-nodes. > >

Re: [ceph-users] Testing a node by fio - strange results to me

2017-01-22 Thread Udo Lembke
Hi, I don't use mds, but I think it's the same as with RBD - the read data is cached on the OSD nodes. The 4MB chunks of the 3G file fit completely in the cache, the others don't. Udo On 18.01.2017 07:50, Ahmed Khuraidah wrote: > Hello community, > > I need your help to understand a little

Re: [ceph-users] Why would "osd marked itself down" will not recognised?

2017-01-12 Thread Udo Lembke
Hi Sam, the web frontend of an external ceph-dash was interrupted until the node was up again. The reboot took approx. 5 min, but the ceph -w output showed some IO much sooner. I will look at the output again tomorrow and create a ticket. Thanks Udo On 12.01.2017 20:02, Samuel Just wrote: > How

Re: [ceph-users] Ceph performance is too good (impossible..)...

2016-12-11 Thread Udo Lembke
Hi, but I assume you are also measuring cache in this scenario - the OSD nodes have cached the writes in the file buffer (due to this the latency should be very small). Udo On 12.12.2016 03:00, V Plus wrote: > Thanks Somnath! > As you recommended, I executed: > dd if=/dev/zero bs=1M count=4096 of=/dev/rbd0

Re: [ceph-users] 10.2.4 Jewel released

2016-12-09 Thread Udo Lembke
Hi, unfortunately there are no Debian Jessie packages... I don't know why a recompile takes such a long time for ceph... I think such an important fix should hit the repos faster. Udo On 09.12.2016 18:54, Francois Lafont wrote: > On 12/09/2016 06:39 PM, Alex Evonosky wrote: > >> Sounds grea

Re: [ceph-users] Help needed ! cluster unstable after upgrade from Hammer to Jewel

2016-11-16 Thread Udo Lembke
Hi, On 16.11.2016 19:01, Vincent Godin wrote: > Hello, > > We now have a full cluster (Mon, OSD & Clients) in jewel 10.2.2 > (initial was hammer 0.94.5) but we have still some big problems on our > production environment : > > * some ceph filesystem are not mounted at startup and we have to >

Re: [ceph-users] Need help! Ceph backfill_toofull and recovery_wait+degraded

2016-11-01 Thread Udo Lembke
Hi again, and change the value with something like this: ceph tell osd.* injectargs '--mon_osd_full_ratio 0.96' Udo On 01.11.2016 21:16, Udo Lembke wrote: > Hi Marcus, > > for quick help you can perhaps increase the mon_osd_full_ratio? > > What values do you have?

Re: [ceph-users] Need help! Ceph backfill_toofull and recovery_wait+degraded

2016-11-01 Thread Udo Lembke
Hi Marcus, for quick help you can perhaps increase the mon_osd_full_ratio? What values do you have? Please post the output of (on host ceph1, because of osd.0.asok) ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep full_ratio after that it would be helpful to use on all hosts
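Putting the two mails of this thread together, a minimal sketch (the socket path and the 0.96 value are just the examples used here; remember to set the ratio back afterwards):

    # check the currently active ratios on one OSD
    ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep full_ratio
    # temporarily raise the full ratio so recovery/deletion can proceed
    ceph tell osd.* injectargs '--mon_osd_full_ratio 0.96'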

Re: [ceph-users] multiple journals on SSD

2016-07-12 Thread Udo Lembke
Hi Vincent, On 12.07.2016 15:03, Vincent Godin wrote: > Hello. > > I've been testing Intel 3500 as journal store for few HDD-based OSD. I > stumble on issues with multiple partitions (>4) and UDEV (sda5, sda6,etc > sometime do not appear after partition creation). And I'm thinking that > partition

Re: [ceph-users] ceph storage capacity does not free when deleting contents from RBD volumes

2016-05-19 Thread Udo Lembke
Hi Albert, to free unused space you must enable trim (or do an fstrim) in the VM - and everything in the storage chain must support this. The normal virtio driver doesn't support trim, but if you use SCSI disks with the virtio-scsi driver you can use it. It works well but needs some time for huge filesystems.
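A rough sketch of such a chain with plain qemu-kvm (image name and IDs are placeholders; Proxmox exposes the same settings in the VM disk options): the relevant pieces are a virtio-scsi controller, a disk with discard enabled, and a trim run inside the guest.

    # qemu options (excerpt):
    #   -device virtio-scsi-pci,id=scsi0
    #   -drive file=rbd:rbd/vm-100-disk-1,if=none,id=drive0,cache=writeback,discard=unmap
    #   -device scsi-hd,drive=drive0,bus=scsi0.0
    # inside the VM, afterwards:
    fstrim -v /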

Re: [ceph-users] Slow read on RBD mount, Hammer 0.94.5

2016-04-25 Thread Udo Lembke
Hi Mike, On 21.04.2016 15:20, Mike Miller wrote: Hi Udo, thanks, just to make sure, further increased the readahead: $ sudo blockdev --getra /dev/rbd0 1048576 $ cat /sys/block/rbd0/queue/read_ahead_kb 524288 No difference here. The first one is in sectors (512 bytes), the second one in KB. oops, sorr
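In numbers, both commands report the same setting in different units (rbd0 as in the thread):

    blockdev --getra /dev/rbd0               # 1048576 -> read-ahead in 512-byte sectors
    cat /sys/block/rbd0/queue/read_ahead_kb  # 524288  -> the same value in KB
    # 1048576 sectors * 512 bytes = 524288 KB, i.e. a 512 MB read-ahead
    blockdev --setra 262144 /dev/rbd0        # example: set a smaller (128 MB) read-ahead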

Re: [ceph-users] Slow read on RBD mount, Hammer 0.94.5

2016-04-21 Thread Udo Lembke
Hi Mike, On 21.04.2016 09:07, Mike Miller wrote: Hi Nick and Udo, thanks, very helpful, I tweaked some of the config parameters along the lines Udo suggests, but still only some 80 MB/s or so. this means you have reached factor 3 (this is roughly the value I see with a single thread on R

Re: [ceph-users] Howto reduce the impact from cephx with small IO

2016-04-20 Thread Udo Lembke
particular these tests: http://www.spinics.net/lists/ceph-devel/msg22416.html Mark On 04/20/2016 11:50 AM, Udo Lembke wrote: Hi, on an small test-system (3 nodes (mon + osd), 6 OSDs, ceph 0.94.6) I compare with and without cephx. I use fio for that inside an VM on an host, outside the 3 ceph-nodes,

[ceph-users] Howto reduce the impact from cephx with small IO

2016-04-20 Thread Udo Lembke
Hi, on a small test system (3 nodes (mon + osd), 6 OSDs, ceph 0.94.6) I compared with and without cephx. I used fio for that inside a VM on a host outside the 3 ceph nodes, with this command: fio --max-jobs=1 --numjobs=1 --readwrite=read --blocksize=4k --size=4G --direct=1 --name=fiojob_4k
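For reference, the cephx-less run in such a test is usually done by switching auth off in ceph.conf on all nodes and clients and restarting the daemons; a hedged sketch of the relevant entries (not something to do on a production cluster):

    [global]
        auth cluster required = none
        auth service required = none
        auth client required = none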

Re: [ceph-users] Slow read on RBD mount, Hammer 0.94.5

2016-04-19 Thread Udo Lembke
Hi Mike, I don't have experience with RBD mounts, but I see the same effect with RBD. You can do some tuning to get better results (disable debug and so on). As a hint, some values from a ceph.conf: [osd] debug asok = 0/0 debug auth = 0/0 debug buffer = 0/0 debug client = 0/0
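The list in the preview is cut off; a hedged sketch of a fuller set of such entries (all standard debug subsystems, silenced the same way):

    [global]
        debug ms = 0/0
        debug osd = 0/0
        debug filestore = 0/0
        debug journal = 0/0
        debug monc = 0/0
        debug rbd = 0/0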

Re: [ceph-users] Deprecating ext4 support

2016-04-12 Thread Udo Lembke
Hi Sage, we run ext4 only on our 8-node cluster with 110 OSDs and are quite happy with ext4. We started with XFS but the latency was much higher compared to ext4... But we use RBD only with "short" filenames like rbd_data.335986e2ae8944a.000761e1. If we can switch from Jewel to K* and

Re: [ceph-users] v0.94.6 Hammer released

2016-02-25 Thread Udo Lembke
Hi, Am 24.02.2016 um 17:27 schrieb Alfredo Deza: > On Wed, Feb 24, 2016 at 4:31 AM, Dan van der Ster wrote: >> Thanks Sage, looking forward to some scrub randomization. >> >> Were binaries built for el6? http://download.ceph.com/rpm-hammer/el6/x86_64/ > > We are no longer building binaries for e

Re: [ceph-users] All SSD Pool - Odd Performance

2015-11-22 Thread Udo Lembke
--verify=0 --verify_fatal=0 --numjobs=4 --rw=randwrite --blocksize=4k --group_reporting Udo On 22.11.2015 23:59, Udo Lembke wrote: > Hi Zoltan, > you are right ( but this was two running systems...). > > I see also an big failure: "--filename=/mnt/test.bin" (use simply > c

Re: [ceph-users] All SSD Pool - Odd Performance

2015-11-22 Thread Udo Lembke
ents clean tomorrow. Udo On 22.11.2015 14:29, Zoltan Arnold Nagy wrote: > It would have been more interesting if you had tweaked only one option > as now we can’t be sure which changed had what impact… :-) > >> On 22 Nov 2015, at 04:29, Udo Lembke > <mailto:ulem...@pola

Re: [ceph-users] All SSD Pool - Odd Performance

2015-11-21 Thread Udo Lembke
Hi Sean, Haomai is right that qemu can make a huge performance difference. I have done two tests against the same ceph cluster (different pools, but this should not make any difference). One test with Proxmox VE 4 (qemu 2.4, iothread for the device, and cache=writeback) gives 14856 IOPS. The same test with prox

Re: [ceph-users] two or three replicas?

2015-11-03 Thread Udo Lembke
Hi, for production (with enough OSDs) three replicas is the right choice. The chance of data loss if two OSDs fail at the same time is too high. And if this happens, most of your data is lost, because the data is spread over many OSDs... And yes - two replicas is faster for writes. Udo On 02.11.20

Re: [ceph-users] Network performance

2015-10-22 Thread Udo Lembke
Hi Jonas, you can create a bond over multiple NICs (which modes are possible depends on your switch) to use one IP address but more than one NIC. Udo On 21.10.2015 10:23, Jonas Björklund wrote: > Hello, > > In the configuration I have read about "cluster network" and "cluster addr". > Is it
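A minimal Debian /etc/network/interfaces sketch for such a bond (interface names, address and the LACP mode are assumptions; which bond-mode works depends on the switch):

    auto bond0
    iface bond0 inet static
        address 192.168.10.11
        netmask 255.255.255.0
        bond-slaves eth0 eth1
        bond-mode 802.3ad
        bond-miimon 100
        bond-xmit-hash-policy layer3+4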

Re: [ceph-users] v0.94.4 Hammer released

2015-10-20 Thread Udo Lembke
Hi, have you changed the ownership as described in Sage's mail about "v9.1.0 Infernalis release candidate released"? #. Fix the ownership:: chown -R ceph:ceph /var/lib/ceph or set ceph.conf to use root instead? When upgrading, administrators have two options: #. Add th
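The two options boil down to roughly this (the setuser line is the variant from the Infernalis release notes for keeping the daemons running as root; treat the exact spelling as an assumption):

    # option 1: hand the data over to the ceph user
    chown -R ceph:ceph /var/lib/ceph

    # option 2: keep running as root via ceph.conf
    [global]
        setuser match path = /var/lib/ceph/$type/$cluster-$id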

Re: [ceph-users] Cache tier experiences (for ample sized caches ^o^)

2015-10-07 Thread Udo Lembke
Hi Christian, On 07.10.2015 09:04, Christian Balzer wrote: > > ... > > My main suspect for the excessive slowness are actually the Toshiba DT > type drives used. > We only found out after deployment that these can go into a zombie mode > (20% of their usual performance for ~8 hours if not perma

Re: [ceph-users] [sepia] debian jessie repository ?

2015-09-25 Thread Udo Lembke
Hi, you can use this sources.list entry: cat /etc/apt/sources.list.d/ceph.list deb http://gitbuilder.ceph.com/ceph-deb-jessie-x86_64-basic/ref/v0.94.3 jessie main Udo On 25.09.2015 15:10, Jogi Hofmüller wrote: > Hi, > > On 2015-09-11 at 13:20, Florent B wrote: > >> Jessie repository will be available

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-07 Thread Udo Lembke
Hi Vickey, I had the same rados bench output after changing the motherboard of the monitor node with the lowest IP... Due to the new mainboard, I assume the hardware clock was wrong during startup. Ceph health showed no errors, but all VMs weren't able to do IO (very high load on the VMs - but no traffic).

Re: [ceph-users] Storage node refurbishing, a "freeze" OSD feature would be nice

2015-08-30 Thread Udo Lembke
Hi Christian, for my setup "b" takes too long - too much data movement and stress on all nodes. I simply (with replica 3) "set noout", reinstalled one node (with a new filesystem on the OSDs, but left them in the crushmap) and started all OSDs (on Friday night) - it took less than one day for
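The sequence described above, as commands (the OSD id and init call are examples):

    ceph osd set noout          # keep the OSDs from being marked out while the node is down
    # ... reinstall the OS, recreate the OSD filesystems, keep the OSD ids in the crushmap ...
    service ceph start osd.12   # start each OSD again once the node is back
    ceph osd unset noout        # allow normal recovery/backfill afterwards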

Re: [ceph-users] Different filesystems on OSD hosts at the same cluster

2015-08-07 Thread Udo Lembke
wadays for two reasons > 1) the default is "relatime" which has minimal impact on performance > 2) AFAIK some ceph features actually use atime (cache tiering was it?) or at > least so I gathered from some bugs I saw > > Jan > >> On 07 Aug 2015, at 16:30, Udo Lembke wr

Re: [ceph-users] RE: Different filesystems on OSD hosts at the same cluster

2015-08-07 Thread Udo Lembke
> > > > From: ceph-users on behalf of Burkhard Linke > > Sent: 7 August 2015 17:37 > To: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Different filesystems on OSD hosts at the > same cluster > > Hi, >

Re: [ceph-users] Different filesystems on OSD hosts at the same cluster

2015-08-07 Thread Udo Lembke
ext4 doesn't support a different journal device like XFS does, but I assume you mean the OSD journal and not the filesystem journal?! Udo On 07.08.2015 16:13, Burkhard Linke wrote: > Hi, > > > On 08/07/2015 04:04 PM, Udo Lembke wrote: >> Hi, >> some time ago I switched

Re: [ceph-users] Different filesystems on OSD hosts at the same cluster

2015-08-07 Thread Udo Lembke
Hi, some time ago I switched all OSDs from XFS to ext4 (step by step). I had no issues during the mixed OSD-format period (the process took some weeks). And yes, for me ext4 also performs better (esp. the latencies). Udo On 07.08.2015 13:31, Межов Игорь Александрович wrote: > Hi! > > We do some perform

Re: [ceph-users] dropping old distros: el6, precise 12.04, debian wheezy?

2015-07-30 Thread Udo Lembke
Hi, dropping Debian wheezy is quite fast - until now there aren't even packages for jessie?! Dropping squeeze I understand, but wheezy at this time? Udo On 30.07.2015 15:54, Sage Weil wrote: > As time marches on it becomes increasingly difficult to maintain proper > builds and packages for older

Re: [ceph-users] Did maximum performance reached?

2015-07-28 Thread Udo Lembke
Hi, On 28.07.2015 12:02, Shneur Zalman Mattern wrote: > Hi! > > And so, in your math > I need to build size = osd, 30 replicas for my cluster of 120TB - to get my > demans 30 replicas is the wrong math! Fewer replicas = more speed (because of less writing). More replicas = less speed. For data

Re: [ceph-users] different omap format in one cluster (.sst + .ldb) - new installed OSD-node don't start any OSD

2015-07-23 Thread Udo Lembke
d when upgrading osd? > > How many osds meet this problems? > > This assert failure means that osd detects a upgraded pg meta object > but failed to read(or lack of 1 key) meta keys from object. > > On Thu, Jul 23, 2015 at 7:03 PM, Udo Lembke wrote: >> Am 21.07

Re: [ceph-users] different omap format in one cluster (.sst + .ldb) - new installed OSD-node don't start any OSD

2015-07-23 Thread Udo Lembke
On 21.07.2015 12:06, Udo Lembke wrote: > Hi all, > ... > > Normally I would say, if one OSD node dies, I simply reinstall the OS and ceph > and I'm back again... but this looks bad > for me. > Unfortunately the system also doesn't start 9 OSDs after I switched back to

[ceph-users] different omap format in one cluster (.sst + .ldb) - new installed OSD-node don't start any OSD

2015-07-21 Thread Udo Lembke
Hi all, we had a ceph cluster with 7 OSD nodes (Debian Jessie (because of the patched tcmalloc) with ceph 0.94) which we expanded with one further node. For this node we use puppet with Debian 7.8, because ceph 0.92.2 doesn't install on Jessie (the upgrade to 0.94.1 worked on the other nodes but 0.94.2 looks not

Re: [ceph-users] He8 drives

2015-07-13 Thread Udo Lembke
Hi, I have just expanded our ceph cluster (7 nodes) with one 8TB HGST (changed from 4TB to 8TB) on each node (plus 11 4TB HGST). But I have set the primary affinity to 0 for the 8TB disks... so my performance values are not 8TB-disk related. Udo On 08.07.2015 02:28, Blair Bethwaite wrote:
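For reference, a single OSD's primary affinity is set like this (osd.84 is a placeholder; on Hammer the monitors also had to allow it, via something like mon osd allow primary affinity = true, which is an assumption about the exact option name):

    # keep the big/slow 8TB disk from acting as primary for its PGs
    ceph osd primary-affinity osd.84 0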

[ceph-users] Strange PGs on a osd which is reweight to 0

2015-07-02 Thread Udo Lembke
Hi all, I want to replace an osd with a bigger one and reweighted the osd to 0: ceph osd tree | grep osd.0 0 3.57999 osd.0 up0 1.0 The cluster is healthy, but pg dump shows PGs which are primary on osd.0: root@ceph-01:~# ceph pg dump | grep "\[0," dumped all
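When reproducing this, note that two different weights are involved; a short sketch of both (osd.0 as in the post):

    ceph osd reweight 0 0            # override weight (0..1) - the REWEIGHT column in ceph osd tree
    ceph osd crush reweight osd.0 0  # CRUSH weight - the WEIGHT column in ceph osd tree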

Re: [ceph-users] How to estimate whether putting a journal on SSD will help with performance?

2015-05-01 Thread Udo Lembke
Hi, On 01.05.2015 10:30, Piotr Wachowicz wrote: > Is there any way to confirm (beforehand) that using SSDs for journals > will help? yes, an SSD journal helps a lot (if you use the right SSDs) for write speed, and in my experience this also helped (but not too much) for read performance. >

Re: [ceph-users] Hammer release data and a Design question

2015-03-27 Thread Udo Lembke
Hi, On 26.03.2015 11:18, 10 minus wrote: > Hi , > > I'm just starting on a small Ceph implementation and wanted to know the > release date for Hammer. > Will it coincide with the release of Openstack. > > My Conf: (using 10G and Jumboframes on Centos 7 / RHEL7 ) > > 3x Mons (VMs) : > CPU - 2 > Me

Re: [ceph-users] How to see the content of an EC Pool after recreate the SSD-Cache tier?

2015-03-26 Thread Udo Lembke
). The only way I see at the moment is to create new rbd disks and copy all blocks with rados get -> file -> rados put. The problem is the time it takes (days to weeks for 3 * 16TB)... Udo > -Greg > > On Thu, Mar 26, 2015 at 8:56 AM, Udo Lembke wrote: >> Hi Greg,
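A sketch of that copy path for a single object (pool and object names are placeholders):

    # read one rbd data object out of the old EC pool ...
    rados -p ec-old get rbd_data.1234567890ab.0000000000000000 /tmp/obj
    # ... and write it into the new EC pool under the same name
    rados -p ec-new put rbd_data.1234567890ab.0000000000000000 /tmp/obj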

Re: [ceph-users] How to see the content of an EC Pool after recreate the SSD-Cache tier?

2015-03-26 Thread Udo Lembke
is different — it just tells you what > objects are present in the PG on that OSD right now. So any objects > which aren't in cache won't show up when listing on the cache pool. > -Greg > > On Thu, Mar 26, 2015 at 3:43 AM, Udo Lembke wrote: >> Hi all, >> due

[ceph-users] How to see the content of an EC Pool after recreate the SSD-Cache tier?

2015-03-26 Thread Udo Lembke
Hi all, due to a very silly approach, I removed the cache tier of a filled EC pool. After recreating the pool and connecting it with the EC pool I don't see any content. How can I see the rbd_data and other files through the new SSD cache tier? I think that I must recreate the rbd_directory (and fill wit

Re: [ceph-users] Strange osd in PG with new EC-Pool - pgs: 2 active+undersized+degraded

2015-03-26 Thread Udo Lembke
magine that > you could have specified enough PGs to make it impossible to form PGs out of > 84 OSDs (I'm assuming your SSDs are in a separate root) but I have to ask... > > -don- > > > -Original Message- > From: Udo Lembke [mailto:ulem...@polarzone.de]

[ceph-users] "won leader election with quorum" during "osd setcrushmap"

2015-03-25 Thread Udo Lembke
Hi, due to PG trouble with an EC pool I modified the crushmap (step set_choose_tries 200) from rule ec7archiv { ruleset 6 type erasure min_size 3 max_size 20 step set_chooseleaf_tries 5 step take default step chooseleaf indep 0 type host
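The usual round trip for such a crushmap edit, as a sketch (file names are arbitrary):

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # edit crush.txt, e.g. add "step set_choose_tries 200" to rule ec7archiv
    crushtool -c crush.txt -o crush.new
    ceph osd setcrushmap -i crush.new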

Re: [ceph-users] Strange osd in PG with new EC-Pool - pgs: 2 active+undersized+degraded

2015-03-25 Thread Udo Lembke
Hi Don, thanks for the info! It looks like choose_tries set to 200 does the trick. But the setcrushmap takes a long, long time (alarming, but the clients still have IO)... hope it's finished soon ;-) Udo On 25.03.2015 16:00, Don Doerner wrote: > Assuming you've calculated the number of PGs reasona

Re: [ceph-users] Strange osd in PG with new EC-Pool - pgs: 2 active+undersized+degraded

2015-03-25 Thread Udo Lembke
more than 300 PGs... Udo Am 25.03.2015 14:52, schrieb Gregory Farnum: > On Wed, Mar 25, 2015 at 1:20 AM, Udo Lembke wrote: >> Hi, >> due to two more hosts (now 7 storage nodes) I want to create an new >> ec-pool and get an strange effect: >> >> ceph@admin:~$ ceph

[ceph-users] Strange osd in PG with new EC-Pool - pgs: 2 active+undersized+degraded

2015-03-25 Thread Udo Lembke
Hi, due to two more hosts (now 7 storage nodes) I want to create a new EC pool and get a strange effect: ceph@admin:~$ ceph health detail HEALTH_WARN 2 pgs degraded; 2 pgs stuck degraded; 2 pgs stuck unclean; 2 pgs stuck undersized; 2 pgs undersized pg 22.3e5 is stuck unclean since forever, curr

Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-09 Thread Udo Lembke
Hi Tony, sounds like a good idea! Udo On 09.03.2015 21:55, Tony Harris wrote: > I know I'm not even close to this type of a problem yet with my small > cluster (both test and production clusters) - but it would be great if > something like that could appear in the cluster HEALTHWARN, if Ceph > co

[ceph-users] too few pgs in cache tier

2015-02-27 Thread Udo Lembke
Hi all, we use an EC pool with a small cache tier in front of it for our archive data (4 * 16TB VM disks). The EC pool has k=3;m=2 because we started with 5 nodes and want to migrate to a new EC pool with k=5;m=2. Therefore we migrate one VM disk (16TB) from the ceph cluster to an FC RAID with the

Re: [ceph-users] Power failure recovery woes

2015-02-17 Thread Udo Lembke
Hi Jeff, is the OSD /var/lib/ceph/osd/ceph-2 mounted? If not, does it help if you mount the OSD and start it with service ceph start osd.2? Udo On 17.02.2015 09:54, Jeff wrote: > Hi, > > We had a nasty power failure yesterday and even with UPS's our small (5 > node, 12 OSD) cluster is havi

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread Udo Lembke
Hi, use: ceph osd crush set 0 0.01 pool=default host=ceph-node1 ceph osd crush set 1 0.01 pool=default host=ceph-node1 ceph osd crush set 2 0.01 pool=default host=ceph-node3 ceph osd crush set 3 0.01 pool=default host=ceph-node3 ceph osd crush set 4 0.01 pool=default host=ceph-node2 ceph osd crush

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread Udo Lembke
Hi, you will get further trouble, because your weight is not correct. You need a weight >= 0.01 for each OSD. This means your OSDs must be 10GB or greater! Udo On 10.02.2015 12:22, B L wrote: > Hi Vickie, > > My OSD tree looks like this: > > ceph@ceph-node3:/home/ubuntu$ ceph osd tree > # i

Re: [ceph-users] erasure code : number of chunks for a small cluster ?

2015-02-06 Thread Udo Lembke
Am 06.02.2015 09:06, schrieb Hector Martin: > On 02/02/15 03:38, Udo Lembke wrote: >> With 3 hosts only you can't survive an full node failure, because for >> that you need >> host >= k + m. > > Sure you can. k=2, m=1 with the failure domain set to host will su

Re: [ceph-users] command to flush rbd cache?

2015-02-04 Thread Udo Lembke
Hi Josh, thanks for the info. detach/reattach should be fine for me, because it's only for performance testing. #2468 would be fine of course. Udo On 05.02.2015 08:02, Josh Durgin wrote: > On 02/05/2015 07:44 AM, Udo Lembke wrote: >> Hi all, >> is there any command to flus

Re: [ceph-users] command to flush rbd cache?

2015-02-04 Thread Udo Lembke
Hi Dan, I mean qemu-kvm, i.e. librbd. But how can I tell kvm to flush the buffer? Udo On 05.02.2015 07:59, Dan Mick wrote: > On 02/04/2015 10:44 PM, Udo Lembke wrote: >> Hi all, >> is there any command to flush the rbd cache like the >> "echo 3 > /proc/sys/vm/

[ceph-users] command to flush rbd cache?

2015-02-04 Thread Udo Lembke
Hi all, is there any command to flush the rbd cache like the "echo 3 > /proc/sys/vm/drop_caches" for the OS cache? Udo

Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-04 Thread Udo Lembke
Hi Marco, On 04.02.2015 10:20, Colombo Marco wrote: ... > We chose the 6TB disks, because we need a lot of storage in a small > number of servers and we prefer servers without too many disks. > However we plan to use max 80% of a 6TB Disk > 80% is too much! You will run into trouble. Ceph

Re: [ceph-users] estimate the impact of changing pg_num

2015-02-01 Thread Udo Lembke
Hi Xu, On 01.02.2015 21:39, Xu (Simon) Chen wrote: > RBD doesn't work extremely well when ceph is recovering - it is common > to see hundreds or a few thousands of blocked requests (>30s to > finish). This translates high IO wait inside of VMs, and many > applications don't deal with this well. th

Re: [ceph-users] erasure code : number of chunks for a small cluster ?

2015-02-01 Thread Udo Lembke
Hi Alexandre, nice to meet you here ;-) With only 3 hosts you can't survive a full node failure, because for that you need hosts >= k + m. And k:1 m:2 doesn't make any sense. I started with 5 hosts and use k:3, m:2. In this case two HDDs can fail or one host can be down for maintenance. Udo PS: yo
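For reference, a profile and pool along those lines are created roughly like this (names and PG count are placeholders; on pre-Luminous releases the failure-domain key was still called ruleset-failure-domain):

    ceph osd erasure-code-profile set ec32 k=3 m=2 ruleset-failure-domain=host
    ceph osd pool create ecpool 1024 1024 erasure ec32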

Re: [ceph-users] OSD capacity variance ?

2015-02-01 Thread Udo Lembke
Hi Howard, I assume it's a typo with 160 + 250 MB. Ceph OSDs must be at least 10GB to get a weight of 0.01. Udo On 31.01.2015 23:39, Howard Thomson wrote: > Hi All, > > I am developing a custom disk storage backend for the Bacula backup > system, and am in the process of setting up a trial Ceph syst

Re: [ceph-users] RBD caching on 4K reads???

2015-01-30 Thread Udo Lembke
e so I > can’t verify if it’s disabled at the librbd level on the client. If > you mean on the storage nodes I’ve had some issues dumping the config. > Does the rbd caching occur on the storage nodes, client, or both? > > > > > > *From:*Udo Lembke [mailto:ulem...@polarzone

Re: [ceph-users] RBD caching on 4K reads???

2015-01-30 Thread Udo Lembke
Hi Bruce, hmm, that sounds to me like the rbd cache. Can you check whether the cache is really disabled in the running config with ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep cache Udo On 30.01.2015 21:51, Bruce McFarland wrote: > > I have a cluster and have created a rbd device -
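Note that rbd cache is a client-side (librbd) setting, so the OSD admin socket mainly shows the OSD's own view; if an admin socket is configured for the client, the same check is possible there (the socket path below is an assumption):

    ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok config show | grep rbd_cache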

Re: [ceph-users] Sizing SSD's for ceph

2015-01-29 Thread Udo Lembke
Hi, Am 29.01.2015 07:53, schrieb Christian Balzer: > On Thu, 29 Jan 2015 01:30:41 + Ramakrishna Nishtala (rnishtal) wrote: >> * Per my understanding once writes are complete to journal then >> it is read again from the journal before writing to data disk. Does this >> mean, we have to

Re: [ceph-users] slow read-performance inside the vm

2015-01-27 Thread Udo Lembke
Hi Patrik, Am 27.01.2015 14:06, schrieb Patrik Plank: > > ... > I am really happy, these values above are enough for my little amount of > vms. Inside the vms I get now for write 80mb/s and read 130mb/s, with > write-cache enabled. > > But there is one little problem. > > Are there some tuning

Re: [ceph-users] Better way to use osd's of different size

2015-01-16 Thread Udo Lembke
Hi Megov, you should weight the OSDs so they represent their size (like a weight of 3.68 for a 4TB HDD). ceph-deploy does this automatically. Nevertheless, even with the correct weight the disks are not filled in equal distribution. For that purpose you can use reweight for single OSDs, or automatically wi
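The variants mentioned, as commands (the 3.68 value is from the mail, the rest are examples):

    ceph osd crush reweight osd.12 3.68   # CRUSH weight matching a 4TB disk
    ceph osd reweight 12 0.9              # per-OSD override to shift data off an over-full OSD
    ceph osd reweight-by-utilization      # or let ceph adjust over-full OSDs automatically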

Re: [ceph-users] Part 2: ssd osd fails often with "FAILED assert(soid < scrubber.start || soid >= scrubber.end)"

2015-01-16 Thread Udo Lembke
Hi Loic, thanks for the answer. I hope it's not like http://tracker.ceph.com/issues/8747 where the issue happens with a patched version, if I understand right. So I must only wait a few months ;-) for a backport... Udo On 14.01.2015 09:40, Loic Dachary wrote: > Hi, > > This is http://tracker.c

[ceph-users] Part 2: ssd osd fails often with "FAILED assert(soid < scrubber.start || soid >= scrubber.end)"

2015-01-14 Thread Udo Lembke
Hi again, sorry that this is not threaded, but my last email didn't come back on the mailing list (I often miss some posts!). Just after sending the last mail, for the first time another SSD failed - in this case a cheap one, but with the same error: root@ceph-04:/var/log/ceph# more ceph-osd.62.log 2015-01-13 16

[ceph-users] ssd osd fails often with "FAILED assert(soid < scrubber.start || soid >= scrubber.end)"

2015-01-13 Thread Udo Lembke
Hi, since last Thursday we have had an SSD pool (cache tier) in front of an EC pool and have been filling the pools with data via rsync (approx. 50MB/s). The SSD pool has three disks and one of them (a DC S3700) has failed four times since then. I simply start the OSD again and the pool is rebuilt and works again for some

Re: [ceph-users] backfill_toofull, but OSDs not full

2015-01-09 Thread Udo Lembke
Hi, I had a similar effect two weeks ago - 1 PG backfill_toofull, and due to reweighting and deleting there was enough free space, but the rebuild process stopped after a while. After stopping and starting ceph on the second node, the rebuild process ran without trouble and the backfill_toofull was gone. Thi

Re: [ceph-users] Improving Performance with more OSD's?

2015-01-04 Thread Udo Lembke
Hi Lindsay, On 05.01.2015 06:52, Lindsay Mathieson wrote: > ... > So two OSD Nodes had: > - Samsung 840 EVO SSD for Op. Sys. > - Intel 530 SSD for Journals (10GB Per OSD) > - 3TB WD Red > - 1 TB WD Blue > - 1 TB WD Blue > - Each disk weighted at 1.0 > - Primary affinity of the WD Red (slow) set to

Re: [ceph-users] Is there an negative relationship between storage utilization and ceph performance?

2015-01-02 Thread Udo Lembke
14 20:49:02 +0100 Udo Lembke wrote: > >> Hi, >> since a long time I'm looking for performance improvements for our >> ceph-cluster. >> The last expansion got better performance, because we add another node >> (with 12 OSDs). The storage utilization was after tha

Re: [ceph-users] Any Good Ceph Web Interfaces?

2014-12-23 Thread Udo Lembke
Hi, for monitoring only I use the Ceph Dashboard https://github.com/Crapworks/ceph-dash/ For me it's a nice tool for a good overview - for administration I use the CLI. Udo On 23.12.2014 01:11, Tony wrote: > Please don't mention calamari :-) > > The best web interface for ceph that actually wo

Re: [ceph-users] v0.90 released

2014-12-23 Thread Udo Lembke
Hi Sage, On 23.12.2014 15:39, Sage Weil wrote: ... > > You can't reduce the PG count without creating new (smaller) pools > and migrating data. Does this also work with the pool metadata, or is this pool essential for ceph? Udo

Re: [ceph-users] How to see which crush tunables are active in a ceph-cluster?

2014-12-20 Thread Udo Lembke
s me... I > intended to try those changes over the holidays... > > > Found it; the subject was "ceph osd crush tunables optimal AND add new > OSD at the same time". > > > On Sat, Dec 20, 2014 at 3:26 AM, Udo Lembke <mailto:ulem...@polarzone.de>> wrote: &g

Re: [ceph-users] How to see which crush tunables are active in a ceph-cluster?

2014-12-20 Thread Udo Lembke
ld an "chooseleaf_vary_r 1" (from 0) take roughly the same time to finish?? Regards Udo On 04.12.2014 14:09, Udo Lembke wrote: > Hi, > to answer myself. > > With ceph osd crush show-tunables I see a little bit more, but I don't > know how far away from firefly-tun

Re: [ceph-users] Reproducable Data Corruption with cephfs kernel driver

2014-12-18 Thread Udo Lembke
Hi Lindsay, have you tried the different cache options (no cache, write through, ...) which Proxmox offers for the drive? Udo On 18.12.2014 05:52, Lindsay Mathieson wrote: > I've been experimenting with CephFS for running KVM images (proxmox). > > cephfs fuse version - 0.87 > > cephfs kernel mod

[ceph-users] Any tuning of LVM-Storage inside an VM related to ceph?

2014-12-18 Thread Udo Lembke
Hi all, I have some fileservers with insufficient read speed. Enabling read-ahead inside the VM improves the read speed, but it looks like this has a drawback during LVM operations like pvmove. For test purposes, I moved the LVM storage inside a VM from vdb to vdc1. It takes days, because it's

Re: [ceph-users] Help with SSDs

2014-12-18 Thread Udo Lembke
Hi Mark, On 18.12.2014 07:15, Mark Kirkwood wrote: > While you can't do much about the endurance lifetime being a bit low, > you could possibly improve performance using a journal *file* that is > located on the 840's (you'll need to symlink it - disclaimer - have > not tried this myself, but wil
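A hedged sketch of how such a journal-file move is usually done (paths and the OSD id are assumptions, and as in the quoted mail: untested advice, try it on one OSD first):

    service ceph stop osd.3
    ceph-osd -i 3 --flush-journal                          # drain the old journal
    ln -sf /ssd/osd.3.journal /var/lib/ceph/osd/ceph-3/journal
    ceph-osd -i 3 --mkjournal                              # create the new journal file on the SSD
    service ceph start osd.3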

Re: [ceph-users] Help with SSDs

2014-12-17 Thread Udo Lembke
Hi Mikaël, > > I have EVOs too, what do you mean by "not playing well with D_SYNC"? > Is there something I can test on my side to compare results with you, > as I have mine flashed? http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/ describes
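The test from the linked article is essentially a small direct+dsync write; a sketch (the target is a placeholder and will be overwritten - only use a scratch device):

    dd if=/dev/zero of=/dev/sdX bs=4k count=100000 oflag=direct,dsync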

Re: [ceph-users] Multiple issues :( Ubuntu 14.04, latest Ceph

2014-12-15 Thread Udo Lembke
Hi, see here: https://www.mail-archive.com/ceph-users@lists.ceph.com/msg15546.html Udo On 16.12.2014 05:39, Benjamin wrote: > I increased the OSDs to 10.5GB each and now I have a different issue... > > cephy@ceph-admin0:~/ceph-cluster$ echo {Test-data} > testfile.txt > cephy@ceph-admin0:~/ceph-cl

Re: [ceph-users] Multiple issues :( Ubuntu 14.04, latest Ceph

2014-12-15 Thread Udo Lembke
Hi Benjamin, On 15.12.2014 03:31, Benjamin wrote: > Hey there, > > I've set up a small VirtualBox cluster of Ceph VMs. I have one > "ceph-admin0" node, and three "ceph0,ceph1,ceph2" nodes for a total of 4. > > I've been following this > guide: http://ceph.com/docs/master/start/quick-ceph-deploy/ to

[ceph-users] For all LSI SAS9201-16i user - don't upgrate to firmware P20

2014-12-11 Thread Udo Lembke
Hi all, I have upgraded two LSI SAS9201-16i HBAs to the latest firmware P20.00.00 and after that I got the following syslog messages: Dec 9 18:11:31 ceph-03 kernel: [ 484.602834] mpt2sas0: log_info(0x3108): originator(PL), code(0x08), sub_code(0x) Dec 9 18:12:15 ceph-03 kernel: [ 528.31017

Re: [ceph-users] Old OSDs on new host, treated as new?

2014-12-05 Thread Udo Lembke
Hi, perhaps a stupid question, but why did you change the hostname? I haven't tried it, but I guess if you boot the node with a new hostname, the old hostname stays in the crush map, but without any OSDs - because they are on the new host. I don't know (I guess not) if the degradation level also stays at 5% if you

Re: [ceph-users] How to see which crush tunables are active in a ceph-cluster?

2014-12-04 Thread Udo Lembke
": 0, "legacy_tunables": 0, "require_feature_tunables": 1, "require_feature_tunables2": 0} Look this like argonaut or bobtail? And how proceed to update? Does in makes sense first go to profile bobtail and then to firefly? Regards Udo Am 01.12.2014 17:39,

[ceph-users] How to see which crush tunables are active in a ceph-cluster?

2014-12-01 Thread Udo Lembke
Hi all, http://ceph.com/docs/master/rados/operations/crush-map/#crush-tunables describes how to set the tunables to legacy, argonaut, bobtail, firefly or optimal. But how can I see which profile is active in a ceph cluster? With "ceph osd getcrushmap" I don't get really much info (only "tunable ch

Re: [ceph-users] Typical 10GbE latency

2014-11-12 Thread Udo Lembke
Hi Wido, On 12.11.2014 12:55, Wido den Hollander wrote: > (back to list) > > > Indeed, there must be something! But I can't figure it out yet. Same > controllers, tried the same OS, direct cables, but the latency is 40% > higher. > > Perhaps something with PCIe order / interrupts? Have you checked

Re: [ceph-users] Typical 10GbE latency

2014-11-06 Thread Udo Lembke
ation on the host? > Thanks. > > Thu Nov 06 2014 at 16:57:36, Udo Lembke <mailto:ulem...@polarzone.de>>: > > Hi, > from one host to five OSD-hosts. > > NIC Intel 82599EB; jumbo-frames; single Switch IBM G8124 (blade > network). > > rtt m

Re: [ceph-users] Typical 10GbE latency

2014-11-06 Thread Udo Lembke
Hi, from one host to five OSD-hosts. NIC Intel 82599EB; jumbo-frames; single Switch IBM G8124 (blade network). rtt min/avg/max/mdev = 0.075/0.114/0.231/0.037 ms rtt min/avg/max/mdev = 0.088/0.164/0.739/0.072 ms rtt min/avg/max/mdev = 0.081/0.141/0.229/0.030 ms rtt min/avg/max/mdev = 0.083/0.115/0

[ceph-users] Is there an negative relationship between storage utilization and ceph performance?

2014-11-04 Thread Udo Lembke
Hi, for a long time I've been looking for performance improvements for our ceph cluster. The last expansion got better performance, because we added another node (with 12 OSDs). The storage utilization after that was 60%. Now we have reached 69% again (the next nodes are waiting for installation) and the perfo

Re: [ceph-users] question about activate OSD

2014-10-31 Thread Udo Lembke
Hi German, if I'm right, the journal creation on /dev/sdc1 failed (perhaps because you only specified /dev/sdc instead of /dev/sdc1?). Do you have partitions on sdc? Udo On 31.10.2014 22:02, German Anders wrote: > Hi all, > I'm having some issues while trying to activate a new osd in a > new clu

Re: [ceph-users] Replacing a disk: Best practices?

2014-10-16 Thread Udo Lembke
On 15.10.2014 22:08, Iban Cabrillo wrote: > HI Cephers, > > I have another question related to this issue: what would be the > procedure to restore a failed server (a whole server, for example due to a > motherboard problem with no damage on the disks)? > > Regards, I > Hi, - change the server board. -

Re: [ceph-users] [PG] Slow request *** seconds old,v4 currently waiting for pg to exist locally

2014-09-24 Thread Udo Lembke
Hi again, sorry - forget my last post... see osdmap e421: 9 osds: 9 up, 9 in which shows that all your 9 osds are up! Do you have trouble with your journal/filesystem? Udo On 25.09.2014 08:01, Udo Lembke wrote: > Hi, > it looks like some osds are down?! > > What is the output of "ceph o
