Hi,
I'm planning a new cluster on a 10GbE network.
Each storage node will have a maximum of 12 SATA disks and 2 SSDs as journals.
What do you suggest as journal size for each OSD? Is 5GB enough?
Should I consider only SATA write speed when calculating journal
size, or network speed as well?
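For reference, a rough sizing sketch based on the usual rule of thumb
(journal size >= 2 * expected throughput * filestore max sync interval);
the throughput figures below are assumptions, not measurements:

  # per-OSD SATA: ~110 MB/s * 2 * 5 s                  ~= 1.1 GB
  # per-OSD share of 10GbE: ~1250 MB/s / 12 * 2 * 5 s  ~= 1.0 GB
  # whichever of the two is lower per OSD is usually taken;
  # 5 GB per journal therefore leaves plenty of headroom.
  [osd]
  osd journal size = 5120    ; in MB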
frequency is 5 seconds.
What do you mean by fine-tuning spinning storage media? Which tuning
are you referring to?
On 09 Jul 2013 at 23:45, "Andrey Korolyov" wrote:
> On Wed, Jul 10, 2013 at 1:16 AM, Gandalf Corvotempesta
> wrote:
> > Hi,
> > i'm
2013/7/12 Mark Nelson :
> At large numbers of PGs it may not matter very much, but I don't think it
> would hurt either!
>
> Basically this has to do with how ceph_stable_mod works. At
> non-power-of-two values, the bucket counts aren't even, but that's only a
> small part of the story and may ult
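A toy shell sketch of the effect being described (this mirrors the stable_mod
folding, not the exact Ceph code path; pg_num=12 is an arbitrary example):

  b=12; bmask=15                     # pg_num=12, mask = next power of two - 1
  for x in $(seq 0 15); do
    if [ $((x & bmask)) -lt $b ]; then
      echo $((x & bmask))
    else
      echo $((x & (bmask >> 1)))     # overflow values fold into the lower half
    fi
  done | sort -n | uniq -c           # buckets 4-7 show up twice as often

With a power-of-two pg_num the else branch is never taken and the mapping is even.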
2013/6/20 Matthew Anderson :
> Hi All,
>
> I've had a few conversations on IRC about getting RDMA support into Ceph and
> thought I would give it a quick attempt to hopefully spur some interest.
> What I would like to accomplish is an RSockets only implementation so I'm
> able to use Ceph, RBD and
I'm looking at some SSD drives to be used as journals.
The Seagate 600 should be the best choice for write-intensive operations (like a journal):
http://www.storagereview.com/seagate_600_pro_enterprise_ssd_review
What do you suggest? Is this good enough?
Should I look for write-intensive operations when se
2013/7/22 Mark Nelson :
>> http://www.storagereview.com/seagate_600_pro_enterprise_ssd_review
"If you used this SSD for 100% sequential writes, you could
theoretically kill it in a little more than a month."
very bad.
Any other suggestions for SSD device?
2013/7/22 Chen, Xiaoxi :
> With “journal writeahead”, the data is first written to the journal, acked to the
> client, and then written to the OSD. Note that the data is always kept in memory before
> it is written to both the OSD and the journal, so the write goes directly from memory to
> the OSDs. This mode suits XFS and EXT4.
What ha
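If it helps, the mode described in the quote maps to these filestore options
(a sketch; as far as I know writeahead is already the default on XFS/ext4):

  [osd]
  filestore journal writeahead = true   ; journal first, ack, then data (XFS/ext4)
  # filestore journal parallel = true   ; journal and data in parallel (btrfs)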
2013/7/22 Chen, Xiaoxi :
> Imagine you have several writes that have been flushed to the journal and acked, but
> not yet written to disk. Now the system crashes due to a kernel panic or power
> failure; you will lose the data in the RAM disk, and thus lose data that was assumed to
> have been written successfully.
The same applies in ca
2013/7/22 Mark Nelson :
> I don't have any in my test lab, but the DC S3700 continues to look like a
> good option and has a great reputation, but might be a bit pricey. From that
> article it looks like the Micron P400m might be worth looking at too, but
> seems to be a bit slower.
DC S3500 shoul
Hi to all,
I would like to achieve a fault-tolerant cluster with an InfiniBand network.
Currently, one rsocket is bound to a single IB port. In the case of a
dual-port HBA, I have to use multiple rsockets to use both ports.
Is it possible to configure Ceph with multiple cluster addresses for each OSD?
2013/6/20 Matthew Anderson :
> Hi All,
>
> I've had a few conversations on IRC about getting RDMA support into Ceph and
> thought I would give it a quick attempt to hopefully spur some interest.
> What I would like to accomplish is an RSockets only implementation so I'm
> able to use Ceph, RBD and
Hi to all.
Let's assume a Ceph cluster used to store VM disk images.
VMs will be booted directly from the RBD.
What will happen in case of an OSD failure if the failed OSD is the
primary the VM is reading from?
2013/9/17 Gregory Farnum :
> The VM read will hang until a replica gets promoted and the VM resends the
> read. In a healthy cluster with default settings this will take about 15
> seconds.
Thank you.
Hi to all.
Currently I'm building a test cluster with 3 OSD servers connected with
IPoIB for the cluster network and 10GbE for the public network.
I have to connect these OSDs to some MON servers located in another
rack with no gigabit or 10Gb connection.
Could I use some 10/100 network ports? Which ki
Hi to all,
increasing the total number of MONs available in a cluster, for
example growing from 3 to 5, will this also decrease the hardware
requirements (i.e. RAM and CPU) for each MON instance?
I'm asking this because our cluster will be made of 5 OSD servers and
I can easily put one MON on each O
2013/9/19 Joao Eduardo Luis :
> We have no benchmarks on that, that I am aware of. But the short and sweet
> answer should be "not really, highly unlikely".
>
> If anything, increasing the number of mons should increase the response
> time, although for such low numbers that should also be virtual
2013/10/24 Wido den Hollander :
> I have never seen one Intel SSD fail. I've been using them since the X25-M
> 80GB SSDs and those are still in production without even one wearing out or
> failing.
Which kind of SSD are you using right now as a journal?
Hi,
what do you think about using a USB pendrive as the boot disk for OSD nodes?
Pendrives are cheap and big enough, and doing this will allow me to use
all spinning disks and SSDs as OSD storage/journals.
Moreover, in the future, I'll be able to boot from the network, replacing the
pendrive without losing space on sp
2013/11/5 :
> It has been reported that the system is heavy on the OS during recovery;
Why? Recovery is done from the OSDs/SSDs; why is Ceph heavy on the OS disks?
There is nothing useful to read from those disks during a recovery.
On 06 Nov 2013 at 23:12, "Craig Lewis" wrote:
>
> For my Ceph cluster, I'm going back to SSDs for the OS. Instead of using
two of my precious 3.5" bays, I'm buying some PCI 2.5" drive bays:
http://www.amazon.com/Syba-Mount-Mobile-2-5-Inch-SY-MRA25023/dp/B0080V73RE,
and plugging them into the mot
Anybody using MONs and RGW inside docker containers?
I would like to use a server with two docker containers, one for mon
and one for RGW
This is to achieve better isolation between services and some reusable
components (the same container can be exported and used multiple times
on multiple server
2013/11/25 James Harper :
> Is the OS doing anything apart from ceph? Would booting a ramdisk-only system
> from USB or compact flash work?
This is the same question I asked some time ago.
Is it OK to use USB as the standard OS (OS, not OSD!) disk? OSDs and journals
will be on dedicated disks.
USB wi
Hi,
what do you think about using the same SSD for the journals and the root partition?
For example:
1x 128GB SSD
6 OSDs
15GB for each journal, one per OSD
5GB as root partition for the OS.
This gives me 95GB of used space and 33GB of unused space (I've read
somewhere that it is better not to use the whole SSD f
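A hedged sgdisk sketch of that layout (the device name and partition names are
assumptions; ceph-disk normally tags journal partitions with its own type GUID):

  sgdisk -n 1:0:+5G -c 1:"root" /dev/sda           # OS root partition
  for i in 2 3 4 5 6 7; do
    sgdisk -n $i:0:+15G -c $i:"journal-$((i-2))" /dev/sda
  done
  # ~33 GB is left unpartitioned for the SSD's internal garbage collection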
2013/12/4 Simon Leinen :
> I think this is a fine configuration - you won't be writing to the root
> partition too much, outside journals. We also put journals on the same
> SSDs as root partitions (not that we're very ambitious about
> performance...).
Do you suggest a RAID1 for the OS partition
2013/12/6 Sebastien Han :
> @James: I think that Gandalf’s main idea was to save some costs/space on the
> servers so having dedicated disks is not an option. (that what I understand
> from your comment “have the OS somewhere else” but I could be wrong)
You are right. I don't have space for one
2013/11/7 Kyle Bader :
> Ceph handles its own logs vs using syslog, so I think you're going to have to
> write to tmpfs and have a logger ship it somewhere else quickly. I have a
> feeling Ceph logs will eat a USB device alive, especially if you have to
> crank up debugging.
I wasn't aware of this.
2013/12/16 Gregory Farnum :
> There are log_to_syslog and err_to_syslog config options that will
> send the ceph log output there. I don't remember all the config stuff
> you need to set up properly and be aware of, but you should be able to
> find it by searching the list archives or the docs.
Th
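For reference, a minimal sketch of the options Greg mentions; shipping the
log to a remote host is then up to your syslog daemon:

  [global]
  log to syslog = true
  err to syslog = true
  # have rsyslog/syslog-ng forward to a remote server so the local
  # USB/OS device is not written to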
Hi to all
I'm playing with ceph-deploy for the first time.
Some questions:
1. How can I set a cluster network to be used by the OSDs? Should I set it manually?
2. Does the admin node need to be reachable from every other server, or can
I use a NATted workstation?
2013/12/17 Alfredo Deza :
> The docs have a quick section to do this with ceph-deploy
> (http://ceph.com/docs/master/start/quick-ceph-deploy/)
> Have you seen that before? Or do you need something that covers a
> cluster in more detail?
There isn't anything about how to define a cluster network fo
2013/12/17 Gandalf Corvotempesta :
> There isn't anything about how to define a cluster network for the OSDs.
> I don't know how to set a cluster address for each OSD.
No help on this? I would like to set a cluster address for each OSD.
Is this possible?
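For what it's worth, a minimal sketch of the usual two-subnet layout
(addresses are examples; the per-OSD override is optional):

  [global]
  public network = 192.168.0.0/24
  cluster network = 10.0.0.0/24

  [osd.0]
  cluster addr = 10.0.0.11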
2013/12/17 Christian Balzer :
> Network:
> Infiniband QDR, 2x 18port switches (interconnected of course), redundant
> paths everywhere, including to the clients (compute nodes).
Are you using IPoIB? How do you interconnect both switches without
creating loops? AFAIK, IB switches don't support ST
I'm looking at this:
https://github.com/ceph/ceph-cookbooks
it seems to support the whole Ceph stack (RGW, MON, OSD, MDS)
Here:
http://wiki.ceph.com/Guides/General_Guides/Deploying_Ceph_with_Chef#Configure_your_Ceph_Environment
I can see that I need to configure the environment as for example and
I
Hi,
I would like to customize my ceph.conf generated by ceph-deploy.
Should I customize the ceph.conf stored on the admin node and then sync it to
each Ceph node?
If yes:
1. Can I sync directly from ceph-deploy or do I have to sync manually via scp?
2. I don't see any host definition in ceph.conf; what wi
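As far as I know, ceph-deploy can push the edited file for you
(node names below are placeholders):

  ceph-deploy --overwrite-conf config push node1 node2 node3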
Hi.
I'm using ntpd on each Ceph server and it is syncing properly, but every
time I reboot, Ceph starts in degraded mode with a "clock skew"
warning.
The only way I have to solve this is to manually restart Ceph on
each node (without resyncing the clock).
Any suggestions?
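One hedged workaround while NTP settles after boot is to loosen the monitors'
tolerance (values are examples; the default drift allowed is 0.05 s):

  [mon]
  mon clock drift allowed = 0.5
  mon clock drift warn backoff = 30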
2014-01-30 Emmanuel Lacour :
> here, I just wait until the skew is finished, without touching ceph. It
> doesn't seem to do anything bad ...
I've waited more than 1 hour with no success.
Hi,
I have a working ceph cluster.
Is it possible to add RGW replication across two sites at a later time,
or is it a feature that needs to be implemented from the start?
2014-01-30 18:41 GMT+01:00 Eric Eastman :
> I have this problem on some of my Ceph clusters, and I think it is due to
> the older hardware I am using not having the best clocks. To fix the
> problem, I set up one server in my lab to be my local NTP time server, and
> then on each of my Ceph
Hi to all
I have this in my conf:
# grep 'pg num' /etc/ceph/ceph.conf
osd pool default pg num = 5600
But:
# ceph osd pool get data pg_num
pg_num: 64
Is this normal? Why were only 64 PGs created?
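If I understand correctly, the default only applies to pools created after
the option is set, so the initial pools have to be resized by hand:

  ceph osd pool set data pg_num 5600
  ceph osd pool set data pgp_num 5600    # pgp_num must follow pg_num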
I've increased PG number to a running cluster.
After this operation, all OSDs from one node were marked as down.
Now, after a while, I'm seeing that the OSDs are slowly coming up again
(sequentially) after rebalancing.
Is this expected behaviour?
2014-03-13 9:02 GMT+01:00 Andrey Korolyov :
> Yes, if you have an essentially high amount of committed data in the cluster
> and/or a large number of PGs (tens of thousands).
I've increased from 64 to 8192 PGs
> If you have a room to
> experiment with this transition from scratch you may want to play wit
2014-03-13 10:53 GMT+01:00 Kasper Dieter :
> After adding two new pools (each with 2 PGs)
> 100 out of 140 OSDs are going down + out.
> The cluster never recovers.
In my case, the cluster recovered after a couple of hours.
How long did you wait?
2014-03-13 11:19 GMT+01:00 Dan Van Der Ster :
> Do you mean you used PG splitting?
>
> You should split PGs by a factor of 2x at a time. So to get from 64 to 8192,
> do 64->128, then 128->256, ..., 4096->8192.
I've brutally increased, no further steps.
64 -> 8192 :-)
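A sketch of the stepwise approach Dan describes (pool name is an example;
wait for the cluster to settle between steps):

  for pg in 128 256 512 1024 2048 4096 8192; do
    ceph osd pool set data pg_num $pg
    ceph osd pool set data pgp_num $pg
    # wait for HEALTH_OK before the next doubling
  done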
2014-03-13 11:23 GMT+01:00 Gandalf Corvotempesta
:
> I've brutally increased, no further steps.
>
> 64 -> 8192 :-)
I'm also unsure whether 8192 PGs is correct for my cluster.
At maximum I'll have 168 OSDs (14 servers, 12 disks each, 1 OSD per
disk), with replica set to
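For what it's worth, the usual back-of-the-envelope calculation for that layout:

  # total PGs ~= (num_osds * 100) / replica_count, summed over all pools
  #   168 * 100 / 3 ~= 5600  ->  rounded up to the next power of two = 8192
  # that budget is then split across the pools that actually hold data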
2014-03-13 11:26 GMT+01:00 Dan Van Der Ster :
> See http://tracker.ceph.com/issues/6922
>
> This is explicity blocked in latest code (not sure if thats released yet).
This seems to explain my behaviour
2014-03-13 11:32 GMT+01:00 Dan Van Der Ster :
> Do you have any other pools? Remember that you need to include _all_ pools
> in the PG calculation, not just a single pool.
Currently I have only the standard pools (there should be 3).
In production I'll also have RGW.
So, which is the exact equation to d
So, in normal conditions with RGW enabled, only 2 pools have data in them:
"data" and ".rgw.buckets"?
In this case, I could use ReplicaNum*2
2014-03-13 11:48 GMT+01:00 Dan Van Der Ster :
> On 13 Mar 2014 at 11:41:30, Gandalf Corvotempesta
> (gandalf.corvotempe...@gmail.com
2014-03-13 12:59 GMT+01:00 Joao Eduardo Luis :
> Anyway, most timeouts will hold for 5 seconds. Allowing clock drifts up to
> 1 second may work, but we don't have hard data to support such claim. Over
> a second of drift may be problematic if the monitors are under some workload
> and message han
I'm getting these errors when trying to upload any file:
2014-04-07 14:33:27.084369 7f5268f86700 5 Getting permissions
id=testuser owner=testuser perm=2
2014-04-07 14:33:27.084372 7f5268f86700 10 uid=testuser requested
perm (type)=2, policy perm=2, user_perm_mask=2, acl perm=2
2014-04-07 14:33:27.084377 7f5
2014-04-07 20:24 GMT+02:00 Yehuda Sadeh :
> Try bumping up logs (debug rgw = 20, debug ms = 1). Not enough info
> here to say much, note that it takes exactly 30 seconds for the
> gateway to send the error response, may be some timeout. I'd verify
> that the correct fastcgi module is running.
Sorr
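For reference, the debug settings Yehuda suggests would go in the gateway's
section of ceph.conf, something like (the section name depends on how your
rgw instance is named):

  [client.radosgw.gateway]
  debug rgw = 20
  debug ms = 1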
-- Forwarded message --
From: Gandalf Corvotempesta
Date: 2014-04-09 14:31 GMT+02:00
Subject: Re: [ceph-users] RadosGW: bad request
To: Yehuda Sadeh
Cc: "ceph-users@lists.ceph.com"
2014-04-07 20:24 GMT+02:00 Yehuda Sadeh :
> Try bumping up logs (debug rgw = 20,
-- Forwarded message --
From: Gandalf Corvotempesta
Date: 2014-04-14 16:06 GMT+02:00
Subject: Fwd: [ceph-users] RadosGW: bad request
To: "ceph-users@lists.ceph.com"
-- Forwarded message --
From: Gandalf Corvotempesta
Date: 2014-04-09 14:31 GMT+02:00
S
I'm trying to configure a small ceph cluster with both public and
cluster networks.
This is my conf:
[global]
public_network = 192.168.0/24
cluster_network = 10.0.0.0/24
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
fsid = 004baba0-74dc-4429-8
During a recovery, I'm hitting the oom-killer for ceph-osd because it's
using more than 90% of available RAM (8GB).
How can I decrease the memory footprint during a recovery?
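A hedged first step is to throttle recovery concurrency; the values below
are examples, not tuned recommendations:

  [osd]
  osd max backfills = 1
  osd recovery max active = 1
  osd recovery op priority = 1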
2014-04-24 18:09 GMT+02:00 Peter :
> Do you have a typo? :
>
> public_network = 192.168.0/24
>
>
> should this read:
>
> public_network = 192.168.0.0/24
Sorry, it was a typo when posting to the list.
The ceph.conf is correct.
cluster IP's defined in the host file on each OSD
> server? As I understand it, the mon's do not use a cluster network, only the
> OSD servers.
>
ount(>~1e8) or any extraordinary configuration parameter.
>
> On Mon, Apr 28, 2014 at 12:26 AM, Gandalf Corvotempesta
> wrote:
>> So, are you suggesting to lower the pg count ?
>> Actually i'm using the suggested number of OSD*100/Replicas
>> and I have just 2 OSD
2014-04-27 23:20 GMT+02:00 Andrey Korolyov :
> For the record, ``rados df'' will give an object count. Would you mind
> to send out your ceph.conf? I cannot imagine what parameter may raise
> memory consumption so dramatically, so config at a glance may reveal
> some detail. Also core dump should b
So, are you suggesting that I lower the PG count?
Currently I'm using the suggested number, OSD*100/Replicas,
and I have just 2 OSDs per server.
2014-04-24 19:34 GMT+02:00 Andrey Korolyov :
> On 04/24/2014 08:14 PM, Gandalf Corvotempesta wrote:
>> During a recovery, I'm hitting
2014-04-27 23:58 GMT+02:00 Andrey Korolyov :
> Nothing looks wrong, except heartbeat interval which probably should
> be smaller due to recovery considerations. Try ``ceph osd tell X heap
> release'' and if it will not change memory consumption, file a bug.
What should I look for when running this?
Se
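A sketch of how one might check it, comparing the heap stats before and after
the release (osd.0 is an example id):

  ceph tell osd.0 heap stats      # note the freelist / unmapped bytes
  ceph tell osd.0 heap release    # return tcmalloc's free pages to the OS
  ceph tell osd.0 heap stats      # compare with the first output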
2014-04-26 12:06 GMT+02:00 Gandalf Corvotempesta
:
> I haven't defined cluster IPs for each OSD server, only the whole subnet.
> Should I define an IP for each OSD? This is not written in the docs and
> could be tricky to do in big environments with hundreds of nodes
I've
2014-04-28 17:17 GMT+02:00 Kurt Bauer :
> What do you mean by "I see all OSDs down"?
I mean that my OSDs are detected as down:
$ sudo ceph osd tree
# id weight type name up/down reweight
-1 12.74 root default
-2 3.64 host osd13
0 1.82 osd.0 down 0
2 1.82 osd.2 down 0
-3 5.46 host osd12
1 1.82 osd
After a simple "service ceph restart" on a server, I'm unable to get
my cluster up again:
http://pastebin.com/raw.php?i=Wsmfik2M
Suddenly, some OSDs go UP and DOWN randomly.
I don't see any network traffic on the cluster interface.
How can I tell what Ceph is doing? From the posted output there i
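A few starting points for seeing what the cluster is doing (the log path
assumes the default layout):

  ceph health detail                       # which OSDs/PGs are flagged and why
  ceph -w                                  # live cluster log
  tail -f /var/log/ceph/ceph-osd.0.log     # per-OSD view, repeat per OSD id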
I'm testing an idle ceph cluster.
My pgmap version is always increasing; is this normal?
2014-04-30 17:20:41.934127 mon.0 [INF] pgmap v281: 640 pgs: 640
active+clean; 0 bytes data, 333 MB used, 14896 GB / 14896 GB avail
2014-04-30 17:20:42.962033 mon.0 [INF] pgmap v282: 640 pgs: 640
active+clean;
2014-04-30 22:11 GMT+02:00 Andrey Korolyov :
> regarding this one and previous you told about memory consumption -
> there are too much PGs, so memory consumption is so high as you are
> observing. Dead loop of osd-never-goes-up is probably because of
> suicide timeout of internal queues. It is may
2014-04-30 14:18 GMT+02:00 Sage Weil :
> Today we are announcing some very big news: Red Hat is acquiring Inktank.
Great news.
Any chance of getting native InfiniBand support in Ceph like in GlusterFS?
2014-04-30 22:27 GMT+02:00 Mark Nelson :
> Check out the xio work that the linuxbox/mellanox folks are working on.
> Matt Benjamin has posted quite a bit of info to the list recently!
Is that usable ?
2014-05-01 0:11 GMT+02:00 Mark Nelson :
> Usable is such a vague word. I imagine it's testable after a fashion. :D
OK, but I'd prefer "official" support with IB integrated into the main Ceph repo
2014-05-01 0:20 GMT+02:00 Matt W. Benjamin :
> Hi,
>
> Sure, that's planned for integration in Giant (see Blueprints).
Great. Any ETA? Firefly was planned for February :)
Hi to all,
I would like to replace a disk used as a journal (one partition for each OSD).
What is the safest method to do so?
2014-05-06 12:39 GMT+02:00 Andrija Panic :
> Good question - I'm also interested. Do you want to movejournal to dedicated
> disk/partition i.e. on SSD or just replace (failed) disk with new/bigger one
> ?
I would like to replace the disk with a bigger one (in fact, my new
disk is smaller, but this
2014-05-06 13:08 GMT+02:00 Dan Van Der Ster :
> I've followed this recipe successfully in the past:
>
> http://wiki.skytech.dk/index.php/Ceph_-_howto,_rbd,_lvm,_cluster#Add.2Fmove_journal_in_running_cluster
I'll try, but my ceph.conf doesn't have any "osd journal" setting
(I'm using ceph-ansibl
2014-05-06 14:09 GMT+02:00 Fred Yang :
> The journal location is not in ceph.conf, check
> /var/lib/ceph/osd/ceph-X/journal, which is a symlink to the osd's journal
> device.
The symlinks point to a partition UUID; this prevents replacement
without manual intervention:
journal -> /dev/disk/by-pa
2014-05-06 16:33 GMT+02:00 Gandalf Corvotempesta
:
> The symlinks point to a partition UUID; this prevents replacement
> without manual intervention:
>
> journal -> /dev/disk/by-partuuid/b234da10-dcad-40c7-aa97-92d35099e5a4
>
> is not possible to create symlink pointing t
2014-05-06 19:40 GMT+02:00 Craig Lewis :
> I haven't tried this yet, but I imagine that the process is similar to
> moving your journal from the spinning disk to an SSD.
My journals are on SSD. I have to replace that SSD.
Very simple question: what happens if the server backing the cache pool goes down?
For example, a read-only cache could be achieved by using a single
server with no redundancy.
Is Ceph smart enough to detect that the cache is unavailable and
transparently redirect all requests to the main pool as usual?
Th
2014-05-08 18:43 GMT+02:00 Indra Pramana :
> Since we don't use ceph.conf to indicate the data and journal paths, how can
> I recreate the journal partitions?
1. Dump the partition scheme:
sgdisk --backup=/tmp/journal_table /dev/sdd
2. Replace the journal disk device
3. Restore the old partition
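For completeness, a hedged sketch of the whole swap for a single OSD
(N is the OSD id, /dev/sdd the new journal device; commands are examples):

  service ceph stop osd.N                      # stop the OSD first
  ceph-osd -i N --flush-journal                # drain the old journal
  # swap the SSD, then restore the partition table, e.g.:
  sgdisk --load-backup=/tmp/journal_table /dev/sdd
  ceph-osd -i N --mkjournal                    # initialize the new journal
  service ceph start osd.N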
Let's assume a test cluster up and running with real data on it.
Which is the best way to migrate everything to a production (and
larger) cluster?
I'm thinking of adding the production MONs to the test cluster, then
adding the production OSDs to the test cluster, waiting for a full rebalance
and then st
2014-05-09 15:55 GMT+02:00 Sage Weil :
> This looks correct to me!
A command to automate this in Ceph would be nice.
For example, skipping the "mkjournal" step:
ceph-osd -i 30 --mkjournal
ceph-osd -i 31 --mkjournal
Ceph should be smart enough to automatically create journals if they are missing,
so that
2014-05-13 21:21 GMT+02:00 Gregory Farnum :
> You misunderstand. Migrating between machines for incrementally
> upgrading your hardware is normal behavior and well-tested (likewise
> for swapping in all-new hardware, as long as you understand the IO
> requirements involved). So is decommissioning o
Let's assume that everything went very, very bad and I have to manually
recover a cluster with an unconfigured Ceph.
1. How can I recover data directly from the raw disks? Is this possible?
2. How can I restore a Ceph cluster (and get the data back) by using the
existing disks?
3. How do you manage backups
Hi,
How does Ceph detect and manage disk failures? What happens if some data is
written to a bad sector?
Is there any chance of the bad sector being "distributed" across the cluster
due to replication?
Is Ceph able to remove the OSD bound to the failed disk automatically?
2016-06-08 20:49 GMT+02:00 Krzysztof Nowicki :
> From my own experience with failing HDDs I've seen cases where the drive was
> failing silently initially. This manifested itself in repeated deep scrub
> failures. Correct me if I'm wrong here, but Ceph keeps checksums of data
> being written and in
On 09 Jun 2016 at 02:09, "Christian Balzer" wrote:
> Ceph currently doesn't do any (relevant) checksumming at all, so if a
> PRIMARY PG suffers from bit-rot this will be undetected until the next
> deep-scrub.
>
> This is one of the longest and gravest outstanding issues with Ceph and
> supposed
2016-06-09 9:16 GMT+02:00 Christian Balzer :
> Neither, a journal failure is lethal for the OSD involved and unless you
> have LOTS of money RAID1 SSDs are a waste.
OK, so if a journal failure is lethal, Ceph automatically removes the
affected OSD
and starts rebalancing, right?
> Additionally your c
The last time I used Ceph (around 2014), RDMA/InfiniBand support was just
a proof of concept
and I was using IPoIB with low performance (about 8-10Gb/s on an
InfiniBand DDR 20Gb/s link).
That was 2 years ago. Any news about this? Is RDMA/InfiniBand
supported like it is with GlusterFS?
2016-06-09 10:18 GMT+02:00 Christian Balzer :
> IPoIB is about half the speed of your IB layer, yes.
OK, so it's normal. I've seen benchmarks on the net stating that IPoIB on
DDR should reach about 16-17Gb/s.
I'll plan to move to QDR.
> And bandwidth is (usually) not the biggest issue, latency is.
I'v
2016-06-09 10:28 GMT+02:00 Christian Balzer :
> Define "small" cluster.
Max 14 OSD nodes with 12 disks each, replica 3.
> Your smallest failure domain both in Ceph (CRUSH rules) and for
> calculating how much over-provisioning you need should always be the
> node/host.
> This is the default CRUSH
On 09 Jun 2016 at 15:41, "Adam Tygart" wrote:
>
> If you're
> using pure DDR, you may need to tune the broadcast group in your
> subnet manager to set the speed to DDR.
Do you know how to set this with opensm?
I would like to bring up my test cluster again in the next few days.
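I haven't verified this on my own hardware, but as far as I understand the
IPoIB broadcast group rate is set in opensm's partition config (rate=6 should
mean 20 Gb/s, i.e. 4x DDR; mtu=4 means 2048 bytes), roughly:

  # /etc/opensm/partitions.conf
  Default=0x7fff, ipoib, rate=6, mtu=4, defmember=full : ALL, ALL_SWITCHES=full;

followed by an opensm restart.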
On 15 Jun 2016 at 03:27, "Christian Balzer" wrote:
> And that makes deep-scrubbing something of quite limited value.
This is not true.
If you checksum *before* writing to disk (when the data is still in RAM),
then when reading back from disk you could do the checksum verification, and
if it doesn't m
On 15 Jun 2016 at 09:42, "Christian Balzer" wrote:
>
> This is why people are using BTRFS and ZFS for filestore (despite the
> problems they in turn create) and why the roadmap for bluestore has
> checksums for reads on it as well (or so we've been told).
Does bitrot happen only on files?
What abou
On 15 Jun 2016 at 09:58, "Christian Balzer" wrote:
> You _do_ know how and where Ceph/RBD store their data?
>
> Right now that's on disks/SSDs, formated with a file system.
> And XFS or EXT4 will not protect against bitrot, while BTRFS and ZFS will.
>
Wait, I'm new to ceph and some things are no
Let's assume a fully redundant network.
We need 4 switches, 2 for the public network, 2 for the cluster network.
10GBase-T has higher latency than SFP+ but is also cheaper, as many
new servers have 10GBase-T integrated onboard and there is no need for
twinax cables or transceivers.
I think that low
2016-06-15 22:13 GMT+02:00 Nick Fisk :
> I would reconsider if you need separate switches for each network, vlans
> would normally be sufficient. If bandwidth is not an issue, you could even
> tag both vlans over the same uplinks. Then there is the discussion around
> whether separate networks are
2016-06-15 22:59 GMT+02:00 Nick Fisk :
> Possibly, but by how much? 20GB of bandwidth is a lot to feed 12x7.2k disks,
> particularly if they start doing any sort of non-sequential IO.
Assuming 100MB/s for each SATA disk, 12 disks give 1200MB/s = 9600Mbit/s.
Why are you talking about 20Gb/s? By usi
2016-06-16 3:53 GMT+02:00 Christian Balzer :
> Gandalf, first read:
> https://www.mail-archive.com/ceph-users@lists.ceph.com/msg29546.html
>
> And this thread by Nick:
> https://www.mail-archive.com/ceph-users@lists.ceph.com/msg29708.html
Interesting reading. Thanks.
> Overly optimistic.
> In an
2016-06-16 12:54 GMT+02:00 Oliver Dzombic :
> aside from the question of the coolness factor of InfiniBand,
> you should always also consider the question of replacing parts and
> extending cluster.
>
> A 10G Network environment is up to date currently, and will be for some
> more years. You can
As I'm planning a new cluster to move all my virtual machines to
(currently on local storage on each hypervisor), I would like to evaluate
the current IOPS on each server.
Knowing the current IOPS, I'll be able to know how many IOPS I need on Ceph.
I'm not an expert; do you know how to get this in
2016-06-17 10:03 GMT+02:00 Christian Balzer :
> I'm unfamilar with Xen and Xenserver (the later doesn't support RBD, btw),
> but if you can see all the combined activity of your VMs on your HW in the
> dom0 like with KVM/qemu, a simple "iostat" or "iostat -x" will give you the
> average IOPS of a d
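A minimal way to sample it, per the suggestion above (sum the r/s and w/s
columns across the data disks to get total IOPS):

  iostat -x 5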
On 18 Jun 2016 at 07:10, "Christian Balzer" wrote:
> That sounds extremely high, is that more or less consistent?
> How many VMs is that for?
> What are you looking at, as in are those individual disks/SSDs, a raid
> (what kind)?
800-1000 was a peak over about 5 minutes. It was just a test to s
2013/2/26 Yehuda Sadeh :
> The admin endpoint is 'admin' by default. You set it through the 'rgw
> admin entry' configurable.
What do you mean by "endpoint"? Currently I'm able to get usage
stats (after adding the usage caps as read-only to my users) from the
"bucket" admin:
GET /admin/usage
Host:
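For reference, the read-only usage caps mentioned above can be granted with
something like (testuser is the example uid):

  radosgw-admin caps add --uid=testuser --caps="usage=read"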