Re: [ceph-users] cache tiering deprecated in RHCS 2.0

2016-10-24 Thread Christian Balzer
Hello, On Mon, 24 Oct 2016 11:49:15 +0200 Dietmar Rieder wrote: > On 10/24/2016 03:10 AM, Christian Balzer wrote: > > [...] > > There are several items here and I very much would welcome a response from > > a Ceph/RH representative. > > > > 1. Is that depreci

Re: [ceph-users] SSS Caching

2016-10-26 Thread Christian Balzer
lication, especially if the individual SSDs are large and/or your network is slow (time to recovery). Christian > 3/ Anything I should be aware of when looking into caching? > > Thanks for your time!, > Ashley > ___ > ceph-users m

Re: [ceph-users] SSS Caching

2016-10-27 Thread Christian Balzer
Hello, On Thu, 27 Oct 2016 11:30:29 +0200 Steffen Weißgerber wrote: > > > > >>> Christian Balzer schrieb am Donnerstag, 27. Oktober 2016 > >>> um > 04:07: > > Hi, > > > Hello, > > > > On Wed, 26 Oct 2016 15:40:00 +

Re: [ceph-users] Some query about using "bcache" as backend of Ceph

2016-10-27 Thread Christian Balzer
th "cache tier/layer", > how to benchmark under this kind of design? Because if we can not find a > scenario to get a stable benchmark IOPS result, we can not evaluate the > impact of configuration/code change of ceph on the ceph performance > (sometimes IOPS resul

Re: [ceph-users] Hammer Cache Tiering

2016-11-01 Thread Christian Balzer
used data, waiting for Jewel might be a better proposition. Of course the lack of any official response to the last relevant thread here about the future of cache tiering makes adding/designing a cache tier an additional challenge... Christian -- Christian Balzer Network/Systems Engineer

Re: [ceph-users] Hammer Cache Tiering

2016-11-01 Thread Christian Balzer
; > From: Christian Wuerdig [mailto:christian.wuer...@gmail.com] > Sent: Wednesday, 2 November 2016 12:57 PM > To: Ashley Merrick > Cc: Christian Balzer ; ceph-us...@ceph.com > Subject: Re: [ceph-users] Hammer Cache Tiering > > > > On Wed, Nov 2, 2016 at 5:19 PM,

Re: [ceph-users] Replication strategy, write throughput

2016-11-06 Thread Christian Balzer
30G --direct=1 --sync=1 --iodepth=128 > --filename=/dev/sdw gives about 200 MB/s (test for journal writes) > [4] Test with iperf3, 1 storage node connects to 2 other nodes to the > backend IP gives 10 Gbit/s throughput for each connection > > > Thanks, > Andreas > ___
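
The journal-write test quoted above can be reconstructed as a full fio command roughly like this; only the options visible in the quote come from the post, while the job name, ioengine, I/O pattern and block size are assumptions, and writing to a raw device like /dev/sdw destroys its contents:

  # Hedged reconstruction of the quoted "test for journal writes"
  fio --name=journal-test --ioengine=libaio --rw=write --bs=1M \
      --size=30G --direct=1 --sync=1 --iodepth=128 --filename=/dev/sdw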

Re: [ceph-users] RBD Block performance vs rbd mount as filesystem

2016-11-06 Thread Christian Balzer
r" > > osd_recovery_op_priority = 4 > > osd_recovery_max_active = 10 > > osd_max_backfills = 4 > > rbd non blocking aio = false > > > > [client] > > rbd_cache = true > > rbd_cache_size = 268435456 > > rbd_cache_max_dirty = 134217728

Re: [ceph-users] Replication strategy, write throughput

2016-11-08 Thread Christian Balzer
On Tue, 8 Nov 2016 08:55:47 +0100 Andreas Gerstmayr wrote: > 2016-11-07 3:05 GMT+01:00 Christian Balzer : > > > > Hello, > > > > On Fri, 4 Nov 2016 17:10:31 +0100 Andreas Gerstmayr wrote: > > > >> Hello, > >> > >> I'd like t

Re: [ceph-users] Replication strategy, write throughput

2016-11-09 Thread Christian Balzer
ntial writes at full speed, not so much. > > Isn't more distributed I/O always favorable? Or is the problem the 4x > overhead (1MB vs 4MB)? > As I said, in general yes. And since you're basically opening the flood gates you're not seeing a difference between str

Re: [ceph-users] - cluster stuck and undersized if at least one osd is down

2016-11-29 Thread Christian Balzer
10 weight 3.636 > >>> item osd.11 weight 3.636 > >>> } > >>> root default { > >>> id -1 # do not change unnecessarily > >>> # weight 43.637 > >>>

Re: [ceph-users] Migrate OSD Journal to SSD

2016-12-01 Thread Christian Balzer
al -> /dev/disk/by-id/wwn-0x55cd2e404b73d570-part4 --- > ceph-osd -i osd.$i -mkjournal > service ceph start osd.$i > ceph osd unset noout > > Does this logic appear to hold up? > Yup. Christian > Appreciate the help. > > Thanks, > > Reed -- Ch
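
For context, the complete journal-migration sequence that snippet is part of looks roughly like the sketch below; the by-id partition path is the one quoted above, while the service commands, flag spelling and use of the numeric OSD id are assumptions that vary by release and init system:

  # Sketch of moving a filestore journal to a new SSD partition; $i is the
  # numeric OSD id, paths and service commands are illustrative only.
  ceph osd set noout
  service ceph stop osd.$i
  ceph-osd -i $i --flush-journal
  ln -sf /dev/disk/by-id/wwn-0x55cd2e404b73d570-part4 /var/lib/ceph/osd/ceph-$i/journal
  ceph-osd -i $i --mkjournal
  service ceph start osd.$i
  ceph osd unset noout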

Re: [ceph-users] is Ceph suitable for small scale deployments?

2016-12-05 Thread Christian Balzer
. Does > that mean that all writes have to sync over all the nodes in the Ceph > cluster, before the write can be considered complete? Or is one node > enough? > All nodes by default. Christian > /Joakim > > ___ > ceph-users mai
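
In other words, with replicated pools a write is acknowledged only after every OSD in the PG's acting set has it. A minimal illustration of the relevant per-pool settings (pool name and values are examples only):

  # With size=3 a write is acked once all three replicas have it; the PG
  # stops accepting I/O if fewer than min_size replicas are available.
  ceph osd pool set rbd size 3
  ceph osd pool set rbd min_size 2
  ceph osd pool get rbd size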

Re: [ceph-users] Interpretation Guidance for Slow Requests

2016-12-05 Thread Christian Balzer
s. > > So. Two questions: > > - any hint (beside from meticuluously reading the source) on interpreting > those slow request messages in detail? > - specifically the “waiting for rw locks” is something that’s new to us - can > someone enlighten me that it m

Re: [ceph-users] Interpretation Guidance for Slow Requests

2016-12-05 Thread Christian Balzer
t; the future. I find slow requests extremely hard to debug and as I said: > aside from scratching my own itch, I’d be happy to help future > travellers. > > > On 6 Dec 2016, at 00:59, Christian Balzer wrote: > > > > Hello, > > > > On Mon, 5 Dec 2016 15:25:37

Re: [ceph-users] Interpretation Guidance for Slow Requests

2016-12-06 Thread Christian Balzer
Hello, On Tue, 6 Dec 2016 11:14:59 -0600 Reed Dier wrote: > > > On Dec 5, 2016, at 9:42 PM, Christian Balzer wrote: > > > > > > Hello, > > > > On Tue, 6 Dec 2016 03:37:32 +0100 Christian Theune wrote: > > > >> Hi Christian (heh), >

Re: [ceph-users] Interpretation Guidance for Slow Requests

2016-12-06 Thread Christian Balzer
Hello, On Tue, 6 Dec 2016 20:58:52 +0100 Christian Theune wrote: > Hi, > > > On 6 Dec 2016, at 04:42, Christian Balzer wrote: > > Jewel issues, like the most recent one with scrub sending OSDs to > > neverland. > > Alright. We’re postponing this for now. Is t

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Christian Balzer
often data is lost > > and/or corrupted which causes even more problems. > > > > I can't stress this enough. Running with size = 2 in production is a > > SERIOUS hazard and should not be done imho. > > > > To anyone out there running with siz

Re: [ceph-users] Interpretation Guidance for Slow Requests

2016-12-07 Thread Christian Balzer
Hello, On Wed, 7 Dec 2016 09:04:37 +0100 Christian Theune wrote: > Hi, > > > On 7 Dec 2016, at 05:14, Christian Balzer wrote: > > > > Hello, > > > > On Tue, 6 Dec 2016 20:58:52 +0100 Christian Theune wrote: > > > >> Alright. We’re postponi

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Christian Balzer
sums like in ZFS or hopefully in Bluestore. The current scrubbing in Ceph is a pretty weak defense against it unless it's 100% clear from drive SMART checks which one the bad source is. Christian > > On 7 Dec 2016, at 13:35, Christian Balzer wrote: > > > > >

Re: [ceph-users] Interpretation Guidance for Slow Requests

2016-12-07 Thread Christian Balzer
0.00 0.04 0.04 79.20 > | > | avg-cpu: %user %nice %system %iowait %steal %idle > | 4.91 0.03 3.52 3.39 0.00 88.15 > | > | Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz > avgqu-sz await r_await w_await svctm %ut

Re: [ceph-users] What happens if all replica OSDs journals are broken?

2016-12-12 Thread Christian Balzer
d- > journal-failure/). > > Does this also work in this case? > Not really, no. The above works by having still a valid state and operational OSDs from which the "broken" one can recover. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.c

Re: [ceph-users] What happens if all replica OSDs journals are broken?

2016-12-13 Thread Christian Balzer
> > 2016-12-13 0:00 GMT+01:00 Christian Balzer : > > > On Mon, 12 Dec 2016 22:41:41 +0100 Kevin Olbrich wrote: > > > > > Hi, > > > > > > just in case: What happens when all replica journal SSDs are broken at > > once? > > > > > T

Re: [ceph-users] [Fixed] OS-Prober In Ubuntu Xenial causes journal errors

2016-12-14 Thread Christian Balzer
achine of mine, as it has a long history of doing unwanted and outright bad things. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine Japan/Rakuten Communications http://www.gol.com/ ___ ce

Re: [ceph-users] fio librbd result is poor

2016-12-18 Thread Christian Balzer
or the architecture is the unbalance weight between > three racks which one rack has only one storage node. > > > > > So can anybody tell us whether this number is reasonable.If not,any > suggestion to improve the number will be appreciated. > > >

Re: [ceph-users] fio librbd result is poor

2016-12-18 Thread Christian Balzer
Hello, On Mon, 19 Dec 2016 15:05:05 +0800 (CST) mazhongming wrote: > Hi Christian, > Thanks for your reply. > > > At 2016-12-19 14:01:57, "Christian Balzer" wrote: > > > >Hello, > > > >On Mon, 19 Dec 2016 13:29:07 +0800 (CST) 马忠明 wrote: >

Re: [ceph-users] Unwanted automatic restart of daemons during an upgrade since 10.2.5 (on Trusty)

2016-12-19 Thread Christian Balzer
gt; - Ken > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine J

[ceph-users] When Zero isn't 0 (Crush weight mysteries)

2016-12-20 Thread Christian Balzer
the CRUSH algorithm goes around and pulls weights out of thin air? Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine Japan/Rakuten Communications http://www.gol.com/ ___ ceph-users mailing

Re: [ceph-users] When Zero isn't 0 (Crush weight mysteries)

2016-12-20 Thread Christian Balzer
g a floating point rounding error of sorts, which also explains an OSD set to a weight of 0.65 to be listed as "0.64999". Christian > > > -Original Message- > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > > Christian Balze

Re: [ceph-users] Read Only Cache Tier

2016-12-21 Thread Christian Balzer
che-tier, write-back is your only valid option. With careful tuning you should be able to avoid having writes go to the cache for objects that aren't already in it. Christian -- Christian Balzer Network/Systems Engineer ch...@go
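
The tuning hinted at above is, roughly, the hit-set/recency machinery; a sketch with a placeholder pool name and example values (not recommendations):

  # Keep the tier in writeback mode but require objects to appear in
  # several recent hit sets before they are promoted into the cache.
  ceph osd tier cache-mode cache-pool writeback
  ceph osd pool set cache-pool hit_set_type bloom
  ceph osd pool set cache-pool hit_set_count 4
  ceph osd pool set cache-pool hit_set_period 1200
  ceph osd pool set cache-pool min_read_recency_for_promote 2
  ceph osd pool set cache-pool min_write_recency_for_promote 2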

Re: [ceph-users] When Zero isn't 0 (Crush weight mysteries)

2016-12-21 Thread Christian Balzer
Hello, On Wed, 21 Dec 2016 11:33:48 +0100 (CET) Wido den Hollander wrote: > > > Op 21 december 2016 om 2:39 schreef Christian Balzer : > > > > > > > > Hello, > > > > I just (manually) added 1 OSD each to my 2 cache-tier nodes. > > The p

Re: [ceph-users] Read Only Cache Tier

2016-12-21 Thread Christian Balzer
y, Ceph is known to be "slow" when doing sequential reads, google "ceph readahead". Also without specifying otherwise, your results are also skewed by the pagecache and don't necessarily reflect actual Ceph performance. Christian > Best regards, > > On De

Re: [ceph-users] osd' balancing question

2017-01-03 Thread Christian Balzer
eph-12 > /dev/sdh1 889G 759G 131G 86% > /var/lib/ceph/osd/ceph-14 > /dev/sdi1 889G 763G 127G 86% > /var/lib/ceph/osd/ceph-16 > /dev/sdj1 889G 732G 158G 83% > /dev
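
For readers landing on this thread, the usual follow-up commands for this kind of imbalance are sketched below; the 110% threshold is an example and test-reweight-by-utilization only exists on newer releases:

  ceph osd df tree                           # per-OSD utilization and variance
  ceph osd test-reweight-by-utilization 110  # dry run against OSDs >110% of average
  ceph osd reweight-by-utilization 110       # apply the reweight
  ceph osd reweight 14 0.9                   # or nudge a single OSD by hand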

Re: [ceph-users] Why is there no data backup mechanism in the rados layer?

2017-01-03 Thread Christian Balzer
bug bringing down your live and backup data. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine Japan/Rakuten Communications http://www.gol.com/ ___ ceph-users mailing list ceph-users@list

Re: [ceph-users] osd' balancing question

2017-01-03 Thread Christian Balzer
ve about versions, restart shouldn't be needed but then again recent experiences do suggest that the "Windows approach" (turning it off and on again) seems to help with Ceph at times, too. Christian > Many Thanks . > > > > > > > > *Yair Magn

Re: [ceph-users] When Zero isn't 0 (Crush weight mysteries)

2017-01-03 Thread Christian Balzer
Hello, On Tue, 3 Jan 2017 16:52:16 -0800 Gregory Farnum wrote: > On Wed, Dec 21, 2016 at 2:33 AM, Wido den Hollander wrote: > > > >> Op 21 december 2016 om 2:39 schreef Christian Balzer : > >> > >> > >> > >> Hello, > >> > >&g

Re: [ceph-users] osd' balancing question

2017-01-03 Thread Christian Balzer
ccept connect_seq 12 vs existing 11 state standby > > Thanks > > > > *Yair Magnezi * > > > > > *Storage & Data Protection TL // KenshooOffice +972 7 32862423 // > Mobile +972 50 575-2955__* > > > > On T

Re: [ceph-users] Is this a deadlock?

2017-01-04 Thread Christian Balzer
art osd.33 /etc/init.d/ceph: osd.33 not found (/etc/ceph/ceph.conf defines mon.engtest03 mon.engtest04 mon.engtest05 mon.irt03 mon.irt04 mds.engtest03 osd.20 osd.21 osd.22 osd.23, /var/lib/ceph defines ) --- Christian -- Christian Balzer Network/Systems Engineer ch...

Re: [ceph-users] cephfs ata1.00: status: { DRDY }

2017-01-05 Thread Christian Balzer
> So if scrub is eating away all IO, the scrub algorythem is simply too > aggressiv. > > Or, and thats most probable i guess, i have some kind of config mistake. > > -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine Jap

Re: [ceph-users] Analysing ceph performance with SSD journal, 10gbe NIC and 2 replicas -Hammer release

2017-01-05 Thread Christian Balzer
y least SSDs with 3+ DWPD endurance like the DC S3610s. In very light loaded cases DC S3520 with 1DWPD may be OK, but again, you need to know what you're doing here. Christian > > Can somebody help me understand this better. > > Regards, > Kevin -- Christian Balzer
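
As a back-of-the-envelope illustration of the DWPD point (drive capacities and the 5-year term are assumptions, not figures from the thread):

  # TB written over warranty = capacity(TB) x DWPD x 365 x years
  echo "0.8 * 3 * 365 * 5" | bc    # 800GB DC S3610 at 3 DWPD -> ~4380 TBW
  echo "0.48 * 1 * 365 * 5" | bc   # 480GB DC S3520 at 1 DWPD ->  ~876 TBW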

Re: [ceph-users] Stability Issue with 52 OSD hosts

2018-08-22 Thread Christian Balzer
d be very telling, collecting and graphing this data might work, too. My suspects would be deep scrubs and/or high IOPS spikes when this is happening, starving out OSD processes (CPU wise, RAM should be fine one supposes). Christian > Please help!!! > ______

Re: [ceph-users] ceph auto repair. What is wrong?

2018-08-24 Thread Christian Balzer
943%), > > 74 pgs degraded, 74 pgs undersized > > > > And ceph does not try to repair pool. Why? > > How long did you wait? The default timeout is 600 seconds before > recovery starts. > > These OSDs are not marked as out yet. > > Wido > > >

Re: [ceph-users] Design a PetaByte scale CEPH object storage

2018-08-26 Thread Christian Balzer
torage per node, SAN and FC are anathema and NVMe is likely not needed in your scenario, at least not for actual storage space. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Best practices for allocating memory to bluestore cache

2018-08-30 Thread Christian Balzer
received this > > message in error, please contact the sender and destroy all copies of this > > email and any attachment(s). > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/

Re: [ceph-users] Bluestore vs. Filestore

2018-10-02 Thread Christian Balzer
ves? > > Thoughts? > > Jesper > > * Bluestore should be the new and shiny future - right? > ** Total mem 1TB+ > > > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.c

Re: [ceph-users] Luminous with osd flapping, slow requests when deep scrubbing

2018-10-16 Thread Christian Balzer
bdev = 0 > debug_rocksdb = 0 > > > Could you share experiences with deep scrubbing of bluestore osds? Are there > any options that I should set to make sure the osds are not flapping and the > client IO is still available? > > Thanks > > Andrei -- Christi
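
For reference, the scrub-throttling knobs usually discussed in this context are sketched below; values are illustrative and option availability/behaviour differs between releases:

  # Throttle scrubbing at runtime ...
  ceph tell osd.* injectargs '--osd_max_scrubs 1 --osd_scrub_sleep 0.1'
  ceph tell osd.* injectargs '--osd_scrub_during_recovery false'
  # ... and/or persist limits in ceph.conf:
  #   osd scrub begin hour    = 22
  #   osd scrub end hour      = 6
  #   osd deep scrub interval = 1209600   # 14 days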

Re: [ceph-users] Luminous with osd flapping, slow requests when deep scrubbing

2018-10-16 Thread Christian Balzer
Hello, On Tue, 16 Oct 2018 14:09:23 +0100 (BST) Andrei Mikhailovsky wrote: > Hi Christian, > > > - Original Message - > > From: "Christian Balzer" > > To: "ceph-users" > > Cc: "Andrei Mikhailovsky" > > Sen

Re: [ceph-users] New Ceph cluster design

2018-03-12 Thread Christian Balzer
s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdb 0.03 83.09 7.07 303.24 746.64 5084.99 37.59 0.05 0.15 0.71 0.13 0.06 2.00 --- 300 write IOPS and 5MB/s for all that time. Christian -- Christian Balzer Network

Re: [ceph-users] Disk write cache - safe?

2018-03-15 Thread Christian Balzer
nage to IT mode style exposure of the disks and still use their HW cache. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@list

Re: [ceph-users] Growing an SSD cluster with different disk sizes

2018-03-19 Thread Christian Balzer
e twice the action, are they a) really twice as fast or b) is your load never going to be an issue anyway? Christian > I'm also using Luminous/Bluestore if it matters. > > Thanks in advance! > > *Mark Steffen* > *"Don't believe everything you read on the Internet

Re: [ceph-users] Growing an SSD cluster with different disk sizes

2018-03-19 Thread Christian Balzer
I really don't and don't expect to). > Understood, thank you! > > *Mark Steffen* > *"Don't believe everything you read on the Internet." -Abraham Lincoln* > > > > On Mon, Mar 19, 2018 at 7:11 AM, Christian Balzer wrote: > > > > > H

Re: [ceph-users] Fwd: High IOWait Issue

2018-03-25 Thread Christian Balzer
IOwait on each ceph host is about 20%. > > > https://prnt.sc/ivne08 > > > > > > > > > Can you guy help me find the root cause of this issue, and how > > to eliminate this high iowait? > > > > > > Thanks in a

Re: [ceph-users] problem while removing images

2018-03-26 Thread Christian Balzer
gt; > Any ideas? > > Regards > > > *Thiago Gonzaga* > SaaSOps Software Architect > o. 1 (512) 2018-287 x2119 > Skype: thiago.gonzaga20 > [image: Aurea] > <http://www.aurea.com/?utm_source=email-signature&utm_medium=email> -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Fwd: Fwd: High IOWait Issue

2018-03-26 Thread Christian Balzer
s to configure things on either host or switch, or with a good modern switch not even buy that much in the latency department. Christian > > 2018-03-26 7:41 GMT+07:00 Christian Balzer : > > > > > Hello, > > > > in general and as reminder for others, the more informati

[ceph-users] Bluestore caching, flawed by design?

2018-03-29 Thread Christian Balzer
SDs. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Bluestore caching, flawed by design?

2018-04-01 Thread Christian Balzer
Hello, firstly, Jack pretty much correctly correlated my issues to Mark's points, more below. On Sat, 31 Mar 2018 08:24:45 -0500 Mark Nelson wrote: > On 03/29/2018 08:59 PM, Christian Balzer wrote: > > > Hello, > > > > my crappy test cluster was rendered inope

Re: [ceph-users] Bluestore caching, flawed by design?

2018-04-02 Thread Christian Balzer
R) CPU E5-1650 v3 @ 3.50GHz (12/6 cores) Christian > I haven't been following these processors lately. Is anyone building CEPH > clusters using them > > On 2 April 2018 at 02:59, Christian Balzer wrote: > > > > > Hello, > > > > firstly, Jack pretty muc

Re: [ceph-users] Cluster unusable after 50% full, even with index sharding

2018-04-13 Thread Christian Balzer
a limitation of Ceph that getting 50% full makes your cluster > unusable? Index sharding has seemed to not help at all (I did some > benchmarking, with 128 shards and then 256; same result each time.) > > Or are we out of luck? -- Christian Balzer Net

Re: [ceph-users] Questions regarding hardware design of an SSD only cluster

2018-04-23 Thread Christian Balzer
as usual and 1-2GB extra per OSD. > Regards, > > Florian > _______ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Christian Balzer Network/Systems Engineer

Re: [ceph-users] Questions regarding hardware design of an SSD only cluster

2018-04-24 Thread Christian Balzer
Hello, On Tue, 24 Apr 2018 11:39:33 +0200 Florian Florensa wrote: > 2018-04-24 3:24 GMT+02:00 Christian Balzer : > > Hello, > > > > Hi Christian, and thanks for your detailed answer. > > > On Mon, 23 Apr 2018 17:43:03 +0200 Florian Florensa wrote: > >

Re: [ceph-users] Poor read performance.

2018-04-25 Thread Christian Balzer
and filestore backend) but used a different perf tool > so don't want to make direct comparisons. > It could be as easy as having lots of pagecache with filestore that helped dramatically with (repeated) reads. But w/o a quiescent cluster determining things might be difficult. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Poor read performance.

2018-04-25 Thread Christian Balzer
Hello, On Wed, 25 Apr 2018 17:20:55 -0400 Jonathan Proulx wrote: > On Wed Apr 25 02:24:19 PDT 2018 Christian Balzer wrote: > > > Hello, > > > On Tue, 24 Apr 2018 12:52:55 -0400 Jonathan Proulx wrote: > > > > The performence I really care about is over rbd

Re: [ceph-users] Public network faster than cluster network

2018-05-09 Thread Christian Balzer
he 1Gb/s links. Lastly, more often than not segregated networks are not needed, add unnecessary complexity and the resources spent on them would be better used to have just one fast and redundant network instead. Christian -- Christian Balzer Network/Systems Engineer

Re: [ceph-users] Public network faster than cluster network

2018-05-10 Thread Christian Balzer
saturating your disks with IOPS long before bandwidth becomes an issue. > thus, a 10GB network would be needed, right ? Maybe a dual gigabit port > bonded together could do the job. > A single gigabit link would be saturated by a single disk. > > Is my assumption correct ? > T
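
The arithmetic behind that statement, with ballpark figures that are assumptions rather than numbers from the thread:

  # 1 Gbit/s is ~125 MB/s before protocol overhead; a single modern 7200rpm
  # HDD streams 150-200 MB/s sequentially, so one disk can saturate the link.
  # With size=3, every client write also causes two replica writes on the
  # cluster network.
  echo "scale=1; 1000/8" | bc    # 125.0 MB/s per 1GbE link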

Re: [ceph-users] RBD v1 image format ...

2017-01-11 Thread Christian Balzer
time conversion tool from v1 to v2 would significantly reduce the number of people who are likely to take issue with this. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine Japan/R

Re: [ceph-users] HEALTH_OK when one server crashed?

2017-01-12 Thread Christian Balzer
ports, disk usage, SMART wear out levels of SSDs down to the individual processes you'd expect to see running on a node: "PROCS OK: 8 processes with command name 'ceph-osd' " I lost single OSDs a few times and didn't notice either by looking at Nagios as the recov
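
The quoted process check is the standard monitoring-plugins check_procs; a minimal sketch (plugin path and the expected count of 8 OSDs are examples for that particular node):

  /usr/lib/nagios/plugins/check_procs -C ceph-osd -c 8:8
  # => PROCS OK: 8 processes with command name 'ceph-osd'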

Re: [ceph-users] Why would "osd marked itself down" will not recognised?

2017-01-12 Thread Christian Balzer
gt; >>>>>> failed (2 > >>>>>> reporters from different host after 21.222945 >= grace 20.388836) > >>>>>> 2017-01-12 10:18:39.681221 mon.0 [INF] osd.5 10.132.7.12:6802/4163 > >>>>>> failed (2 > >>>>>> reporters from differ

[ceph-users] Inherent insecurity of OSD daemons when using only a "public network"

2017-01-13 Thread Christian Balzer
respective configuration. The above is with Hammer, any version. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine Japan/Rakuten Communications http://www.gol.com/ ___ ceph-users mailing

Re: [ceph-users] Ceph Network question

2017-01-13 Thread Christian Balzer
there not being any security bugs. > > > > If you only have one 10Gb connection, perhaps consider separate VLANs? > > > > Oliver. > > > > > -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine Japan/Rakuten Communicat

Re: [ceph-users] All SSD cluster performance

2017-01-13 Thread Christian Balzer
fications of the cluster, it can do > >>> it with no problems. > >> > >> A few tips: > >> > >> - Disable all logging in Ceph (debug_osd, debug_ms, debug_auth, etc, > >> etc) > > > > All logging is configured to default settings, should those be turned

Re: [ceph-users] slow requests break performance

2017-02-01 Thread Christian Balzer
nds > >>>> old, received at 2017-01-11 [...] ack+ondisk+write+known_if_redirected > >>>> e12440) currently waiting for subops from 0,12 > >>>> > >>>> I assumed that osd.16 is the one causing problems. > >>> > >>>

Re: [ceph-users] slow requests break performance

2017-02-01 Thread Christian Balzer
92.168.160.0/24 > osd_pool_default_size = 3 > osd pool default min size = 2 > osd_crush_chooseleaf_type = 1 > mon_pg_warn_max_per_osd = 0 > auth_cluster_required = cephx > auth_service_required = cephx > auth_client_required = cephx > filestore_xattr_use_omap = true &

Re: [ceph-users] Experience with 5k RPM/archive HDDs

2017-02-02 Thread Christian Balzer
; > > Maxime > > > > _______ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > > -- Christian Balzer Network/Systems Engineer

Re: [ceph-users] slow requests break performance

2017-02-03 Thread Christian Balzer
a SSD would need sufficient endurance (you have to determine that, but something like a DC S3610 comes to mind), but not masive speed with this kind of network. And if you can afford only one per node, it would also become a nice SPoF. Christian > Eugen > > Zitat von Christian Balzer

Re: [ceph-users] Why is bandwidth not fully saturated?

2017-02-05 Thread Christian Balzer
0 : 0 0 > 4973B 3251B: 66B0 | 023k: 0 0 : 0 0 : 0 0 > 22k 1752B: 66B0 | 041k: 0 0 : 0 0 : 0 0 > 733B 1267B: 66B0 | 0 0 : 0 0 : 0 0 : 0 0 > 1580B 1900B: 66B0 | 0 0 : 0

Re: [ceph-users] Anyone using LVM or ZFS RAID1 for boot drives?

2017-02-12 Thread Christian Balzer
chev > Storcium > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -- Christian Balzer Network/Systems Engineer

Re: [ceph-users] RBD client newer than cluster

2017-02-14 Thread Christian Balzer
> ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > ___ > ceph-users mailing list > ceph-users@lists.ceph.c

Re: [ceph-users] bcache vs flashcache vs cache tiering

2017-02-14 Thread Christian Balzer
I think both approaches have different strengths and probably the difference > between a tiering system and a caching one is what causes some of the > problems. > > If something like bcache is going to be the preferred approach, then I think > more work needs to be done around certi

Re: [ceph-users] bcache vs flashcache vs cache tiering

2017-02-15 Thread Christian Balzer
; removed as part of a new/different rados tiering function in rados is > really a function of how the code refactor works out and how difficult it > is to support vs the use cases it covers that the new tiering does not. > > sage > -- Christian Balzer Network/Syste

Re: [ceph-users] Jewel + kernel 4.4 Massive performance regression (-50%)

2017-02-19 Thread Christian Balzer
> - Journals are on the disk. > >> > > >> > bench5 : Ubuntu 14.04 / Ceph Infernalis > >> > bench6 : Ubuntu 14.04 / Ceph Jewel > >> > bench7 : Ubuntu 16.04 / Ceph jewel > >> > > >> >

Re: [ceph-users] Jewel + kernel 4.4 Massive performance regression (-50%)

2017-02-20 Thread Christian Balzer
test again with just the EXT4 node active. And this time 4.9 came out (slightly) ahead: 3645 IOPS 3.16 3970 IOPS 4.9 --- Christian On Mon, 20 Feb 2017 13:10:38 +0900 Christian Balzer wrote: > Hello, > > On Thu, 16 Feb 2017 17:51:18 +0200 Kostis Fardelas wrote: > > > Hell

Re: [ceph-users] How safe is ceph pg repair these days?

2017-02-20 Thread Christian Balzer
; kind of error on two different OSDs is highly improbable. I don't > >> > > understand why ceph repair wouldn't have done this all along. > >> > > > >> > > What is the current best practice in the use of ceph repair? > >> > > &g

Re: [ceph-users] How safe is ceph pg repair these days?

2017-02-20 Thread Christian Balzer
Hello, On Mon, 20 Feb 2017 17:15:59 -0800 Gregory Farnum wrote: > On Mon, Feb 20, 2017 at 4:24 PM, Christian Balzer wrote: > > > > Hello, > > > > On Mon, 20 Feb 2017 14:12:52 -0800 Gregory Farnum wrote: > > > >> On Sat, Feb 18, 2017 at 12:39

Re: [ceph-users] Having many Pools

2017-02-21 Thread Christian Balzer
formulas and PGcalc recommend for your cluster size. And that's obviously not a number that will scale up with the amount of users. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine Japan/Rakuten Communications h

Re: [ceph-users] NVRAM cache for ceph journal

2017-02-21 Thread Christian Balzer
rom some of its data on fast storage (WAL, DB) the exact size requirements are not exactly clear to me at this time. What it definitely won't need are (relatively large) journals. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Global

[ceph-users] A Jewel in the rough? (cache tier bugs and documentation omissions)

2017-03-06 Thread Christian Balzer
ewel), nor mentioned in the changelogs and most importantly STILL default to the broken reverse settings above. Anybody coming from Hammer or even starting with Jewel and using cache tiering will be having a VERY bad experience. Christian -- Christian Balzer Network/Systems Engineer

Re: [ceph-users] A Jewel in the rough? (cache tier bugs and documentation omissions)

2017-03-06 Thread Christian Balzer
On Tue, 7 Mar 2017 01:44:53 + John Spray wrote: > On Tue, Mar 7, 2017 at 12:28 AM, Christian Balzer wrote: > > > > > > Hello, > > > > It's now 10 months after this thread: > > > > http://www.spinics.net/lists/ceph-users/msg27497.html

Re: [ceph-users] Mix HDDs and SSDs togheter

2017-03-06 Thread Christian Balzer
otify > >>> the sender immediately by e-mail if you have received this e-mail by > >>> mistake and delete this e-mail from your system. If you are not the > >>> intended recipient you are notified that disclosing, copying, distributing > >>> or taking any action

Re: [ceph-users] hammer to jewel upgrade experiences? cache tier experience?

2017-03-06 Thread Christian Balzer
uncommon but i am not sure > if that is the cases. are there many on the list using cache tiering? in > particular, with rbd volumes and clients? what are some of the the > communities' experiences there? > Quite a few people here seem to use cache-tiers, if understood and configured

Re: [ceph-users] MySQL and ceph volumes

2017-03-07 Thread Christian Balzer
sed with the consent of the copyright > owner. If you have received this email by mistake or by breach of the > confidentiality clause, please notify the sender immediately by return email > and delete or destroy all copies of the email. Any confidentiality, privilege > or copyri

Re: [ceph-users] hammer to jewel upgrade experiences? cache tier experience?

2017-03-07 Thread Christian Balzer
[re-adding ML, so others may benefit] On Tue, 7 Mar 2017 13:14:14 -0700 Mike Lovell wrote: > On Mon, Mar 6, 2017 at 8:18 PM, Christian Balzer wrote: > > > On Mon, 6 Mar 2017 19:57:11 -0700 Mike Lovell wrote: > > > > > has anyone on the list done an upgrade from h

[ceph-users] Bogus "inactive" errors during OSD restarts with Jewel

2017-03-08 Thread Christian Balzer
her reflected in any logs, nor true of course (the restarts take a few seconds per OSD and the cluster is fully recovered to HEALTH_OK in 12 seconds or so. But it surely is a good scare for somebody not doing this on a test cluster. Anybody else seeing this? Christian -- Christian Balzer

[ceph-users] Jewel problems with sysv-init and non ceph-deploy (udev trickery) OSDs

2017-03-08 Thread Christian Balzer
ight thing via udev rules. Correct? Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine Japan/Rakuten Communications http://www.gol.com/ ___ ceph-users mailing list ceph-users@lists.ceph.com

Re: [ceph-users] speed decrease with size

2017-03-12 Thread Christian Balzer
; 3145728000 bytes (3.1 GB) copied, 10.0054 s, 314 MB/s > > dd if=/dev/zero of=/mnt/ext4/output bs=1000k count=5k; rm -f > /mnt/ext4/output; > 5120+0 records in > 5120+0 records out > 5242880000 bytes (5.2 GB) copied, 24.1971 s, 217 MB/s > > Any suggestions for improving t
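
A hedged benchmarking note: dd results like the above include the page cache unless the data is forced to stable storage, e.g.:

  # Same write test, but forcing data out before dd reports a rate
  # (conv=fdatasync) or bypassing the page cache entirely (oflag=direct):
  dd if=/dev/zero of=/mnt/ext4/output bs=1000k count=5k conv=fdatasync
  dd if=/dev/zero of=/mnt/ext4/output bs=1000k count=5k oflag=direct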

Re: [ceph-users] Upgrading 2K OSDs from Hammer to Jewel. Our experience

2017-03-12 Thread Christian Balzer
ur biggest cluster was not a 100% > > success, but the problems where relative small and the cluster stayed > > on-line and there where only a few virtual openstack instances that did not > > like the blocked I/O and had to be restarted. > > > > > > -- > > &g

Re: [ceph-users] Upgrading 2K OSDs from Hammer to Jewel. Our experience

2017-03-12 Thread Christian Balzer
and > restart the OSDs one-by-one. When they come back up, they will just > automatically switch to running as ceph:ceph. > Though if you have external journals and didn't use ceph-deploy, you're boned with the whole ceph:ceph approach. Christian -- Christian Balzer N

Re: [ceph-users] speed decrease with size

2017-03-13 Thread Christian Balzer
Hello, On Mon, 13 Mar 2017 11:25:15 -0400 Ben Erridge wrote: > On Sun, Mar 12, 2017 at 8:24 PM, Christian Balzer wrote: > > > > > Hello, > > > > On Sun, 12 Mar 2017 19:37:16 -0400 Ben Erridge wrote: > > > > > I am testing attached volum

Re: [ceph-users] total storage size available in my CEPH setup?

2017-03-13 Thread Christian Balzer
973 967 5179 > Email: james.ok...@dialogic.com<mailto:james.ok...@dialogic.com> > Web:www.dialogic.com<http://www.dialogic.com/> - The Network Fuel > Company<http://www.dialogic.com/en/landing/itw.aspx> > > This e-mail is intended only for the named re

Re: [ceph-users] Ceph Bluestore

2017-03-14 Thread Christian Balzer
rs). I found EXT4 a better fit for our needs (just RBD) in all the years I tested and compared it with XFS, but if you want to go down the path of least resistance and have a large pool of people to share your problems with, XFS is your only choice at this time. If your machines are

Re: [ceph-users] Ceph Bluestore

2017-03-15 Thread Christian Balzer
gt; least resistance and have a large pool of people to share your problems > > with, XFS is your only choice at this time. > Why (how) EXT4 got "deprecated" for RGW use? Also could you give me any > comparison between EXT4 and XFS (latency, throughput, etc)? > R
