On Wed, 15 Jun 2016 12:46:49 +0200 Gandalf Corvotempesta wrote:
> On 15 Jun 2016 09:58, "Christian Balzer" wrote:
> > You _do_ know how and where Ceph/RBD store their data?
> >
> > Right now that's on disks/SSDs, formatted with a file system.
> > And XFS or EXT4 will not protect against bitrot, while BTRFS and ZFS will.
On 15 Jun 2016 09:58, "Christian Balzer" wrote:
> You _do_ know how and where Ceph/RBD store their data?
>
> Right now that's on disks/SSDs, formatted with a file system.
> And XFS or EXT4 will not protect against bitrot, while BTRFS and ZFS will.
>
Wait, I'm new to ceph and some things are no
On Wed, 15 Jun 2016 09:50:43 +0200 Gandalf Corvotempesta wrote:
> On 15 Jun 2016 09:42, "Christian Balzer" wrote:
> >
> > This is why people are using BTRFS and ZFS for filestore (despite the
> > problems they in turn create) and why the roadmap for bluestore has
> > checksums for reads on it as well (or so we've been told).
On 15 Jun 2016 09:42, "Christian Balzer" wrote:
>
> This is why people are using BTRFS and ZFS for filestore (despite the
> problems they in turn create) and why the roadmap for bluestore has
> checksums for reads on it as well (or so we've been told).
Bitrot happens only on files?
What about
Hello,
On Wed, 15 Jun 2016 08:48:57 +0200 Gandalf Corvotempesta wrote:
> On 15 Jun 2016 03:27, "Christian Balzer" wrote:
> > And that makes deep-scrubbing something of quite limited value.
>
> This is not true.
Did you read what Jan and I wrote?
> If you checksum *before* writing to disk (so while the data is still in RAM)
On 15 Jun 2016 03:27, "Christian Balzer" wrote:
> And that makes deep-scrubbing something of quite limited value.
This is not true.
If you checksum *before* writing to disk (so while the data is still in RAM),
then when reading it back from disk you can verify the checksum, and if it
doesn't match you know the on-disk copy is corrupted.
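A minimal sketch of the scheme described above (this is not what Ceph's filestore did at the time, per the rest of this thread): compute a digest while the buffer is still in RAM, persist it next to the data, and re-verify on every read. The ".sha256" sidecar file and SHA-256 itself are just illustrative choices.

import hashlib
import os

def write_with_checksum(path: str, data: bytes) -> None:
    """Checksum the buffer while it is still in RAM, then persist both."""
    digest = hashlib.sha256(data).hexdigest()
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    # A real object store would keep this in metadata; a sidecar file is enough here.
    with open(path + ".sha256", "w") as f:
        f.write(digest)

def read_with_verification(path: str) -> bytes:
    """Re-compute the digest on read; a mismatch means the on-disk copy is bad."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path + ".sha256") as f:
        expected = f.read().strip()
    if hashlib.sha256(data).hexdigest() != expected:
        raise IOError(f"checksum mismatch for {path}: on-disk data is corrupted")
    return data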
This is why I use btrfs mirror sets underneath Ceph, and hopefully I more than
make up for the space loss by going with 2 replicas instead of 3 and using
on-the-fly LZO compression. The Ceph deep scrubs replace any need for btrfs
scrubs, but I still get the benefit of self-healing when btrfs finds bitrot.
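Rough arithmetic on that trade-off, assuming nothing beyond the layouts described above: two Ceph replicas on btrfs RAID1 mirrors mean four physical copies of every byte versus three copies for a plain 3-replica pool, so LZO has to reach roughly a 1.33x compression ratio before the mirrored layout breaks even on usable space.

# Usable fraction of raw capacity for the two layouts discussed above
# (illustrative only; ignores metadata overhead and nearfull headroom).
plain_3x = 1 / 3           # 3 Ceph replicas, one copy per physical disk
mirrored_2x = 1 / (2 * 2)  # 2 Ceph replicas, each on a 2-disk btrfs RAID1 mirror

print(f"usable fraction, 3x replication:            {plain_3x:.3f}")
print(f"usable fraction, 2x replication on mirrors: {mirrored_2x:.3f}")

# Compression ratio LZO would need for the mirrored layout to match 3x usable capacity.
print(f"LZO ratio needed to break even: {plain_3x / mirrored_2x:.2f}x")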
Hello,
On Tue, 14 Jun 2016 14:26:41 +0200 Jan Schermer wrote:
> Hi,
> bit rot is not "bit rot" per se - nothing is rotting on the drive
> platter.
Never mind that I used the wrong terminology (according to Wiki) and that
my long experience with "laser-rot" probably caused me to choose that
term.
Hi,
bit rot is not "bit rot" per se - nothing is rotting on the drive platter. It
occurs during reads (mostly, anyway), and it's random.
You can happily read a block and get the correct data, then read it again and
get garbage, then get correct data again.
This could be caused by a worn out cell
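To make that concrete, here is a small hypothetical sketch (device path, offset and block size are made up): re-read the same block a few times and compare digests. If the digests disagree between passes, the corruption is on the read path (cable, controller, worn cell) rather than a stable bad copy on the platter; note the OS page cache can mask such errors unless you bypass it.

import hashlib

def classify_block(dev_path: str, offset: int, length: int, passes: int = 3) -> str:
    """Re-read one block several times and compare the digest of each pass."""
    digests = set()
    for _ in range(passes):
        with open(dev_path, "rb", buffering=0) as f:
            f.seek(offset)
            digests.add(hashlib.sha256(f.read(length)).hexdigest())
    if len(digests) > 1:
        return "unstable reads: transient 'bitrot' on the read path"
    return "stable reads: any corruption here would be persistent on disk"

# Hypothetical usage:
# print(classify_block("/dev/sdb", offset=4096 * 1000, length=4096))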
2016-06-09 10:28 GMT+02:00 Christian Balzer :
> Define "small" cluster.
Max 14 OSD nodes with 12 disks each, replica 3.
> Your smallest failure domain both in Ceph (CRUSH rules) and for
> calculating how much over-provisioning you need should always be the
> node/host.
> This is the default CRUSH
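A quick sizing sanity check with the host as the failure domain, using the node and disk counts from this thread plus an assumed 4 TB per disk (not stated here): when one of the 14 nodes dies, its data has to be re-replicated onto the remaining 13, so usage should stay below roughly 13/14 of capacity even before the usual nearfull ratio is applied.

nodes = 14
disks_per_node = 12
disk_tb = 4.0   # assumed disk size, not given in the thread
replicas = 3

raw_tb = nodes * disks_per_node * disk_tb
usable_tb = raw_tb / replicas

# If one node fails, its share (1/nodes) must be re-created on the survivors,
# so the cluster can only be filled to 1 - 1/nodes and still recover fully.
max_fill = 1 - 1 / nodes

print(f"raw capacity:     {raw_tb:.0f} TB")
print(f"usable (3x):      {usable_tb:.0f} TB")
print(f"keep usage below: {max_fill:.1%} of usable (~{usable_tb * max_fill:.0f} TB)")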
Hello,
On Thu, 9 Jun 2016 09:59:04 +0200 Gandalf Corvotempesta wrote:
> 2016-06-09 9:16 GMT+02:00 Christian Balzer :
> > Neither, a journal failure is lethal for the OSD involved and unless
> > you have LOTS of money RAID1 SSDs are a waste.
>
> Ok, so if a journal failure is lethal, Ceph automatically removes the affected OSD and starts rebalancing, right?
2016-06-09 9:16 GMT+02:00 Christian Balzer :
> Neither, a journal failure is lethal for the OSD involved and unless you
> have LOTS of money RAID1 SSDs are a waste.
Ok, so if a journal failure is lethal, Ceph automatically removes the affected
OSD and starts rebalancing, right?
> Additionally your c
Hello,
On Thu, 9 Jun 2016 08:43:23 +0200 Gandalf Corvotempesta wrote:
> On 09 Jun 2016 02:09, "Christian Balzer" wrote:
> > Ceph currently doesn't do any (relevant) checksumming at all, so if a
> > PRIMARY PG suffers from bit-rot this will be undetected until the next
> > deep-scrub.
> >
>
On 09 Jun 2016 02:09, "Christian Balzer" wrote:
> Ceph currently doesn't do any (relevant) checksumming at all, so if a
> PRIMARY PG suffers from bit-rot this will be undetected until the next
> deep-scrub.
>
> This is one of the longest and gravest outstanding issues with Ceph and
> supposed
Hello,
On Wed, 08 Jun 2016 20:26:56 +0000 Krzysztof Nowicki wrote:
> Hi,
>
> On Wed, 8 Jun 2016 at 21:35, Gandalf Corvotempesta <
> gandalf.corvotempe...@gmail.com> wrote:
>
> > 2016-06-08 20:49 GMT+02:00 Krzysztof Nowicki <
> > krzysztof.a.nowi...@gmail.com>:
> > From my own experience
As long as there hasn't been a change recently, Ceph does not store checksums.
Deep scrub compares checksums across replicas.
See
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-October/034646.html
On 8 June 2016 at 22:27:46, Krzysztof Nowicki wrote:
Hi,
On Wed, 8 Jun 2016 at 21:35, Gandalf Corvotempesta <gandalf.corvotempe...@gmail.com> wrote:
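A toy model of what the linked post describes, with made-up OSD names: a deep scrub can only compare digests computed from each replica against one another. When they disagree it can flag the PG as inconsistent, but with no stored checksum it cannot tell on its own which copy is the good one.

import hashlib
from collections import Counter

def deep_scrub(replicas: dict[str, bytes]) -> str:
    """Compare digests across replicas, roughly what a deep scrub does."""
    digests = {osd: hashlib.sha256(data).hexdigest() for osd, data in replicas.items()}
    if len(Counter(digests.values())) == 1:
        return "clean: all replicas agree"
    # No authoritative copy exists without a stored checksum; a repair
    # typically just pushes the primary's version to the other replicas.
    return f"inconsistent: {digests}"

# One silently corrupted replica (names are hypothetical):
print(deep_scrub({"osd.1": b"hello", "osd.7": b"hello", "osd.12": b"hellx"}))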
On 8 June 2016 at 22:27:46, Krzysztof Nowicki wrote:
Hi,
On Wed, 8 Jun 2016 at 21:35, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:
2016-06-08 20:49 GMT+02:00 Krzysztof Nowicki <
krzysztof.a.nowi...@gmail.com>:
> From my own experience with failing HDDs I've seen
Hi,
On Wed, 8 Jun 2016 at 21:35, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:
> 2016-06-08 20:49 GMT+02:00 Krzysztof Nowicki <
> krzysztof.a.nowi...@gmail.com>:
> > From my own experience with failing HDDs I've seen cases where the drive
> > was failing silently initially.
2016-06-08 20:49 GMT+02:00 Krzysztof Nowicki :
> From my own experience with failing HDDs I've seen cases where the drive was
> failing silently initially. This manifested itself in repeated deep scrub
> failures. Correct me if I'm wrong here, but Ceph keeps checksums of data
> being written and in case that data is read back corrupted on one of
Hi,
From my own experience with failing HDDs I've seen cases where the drive
was failing silently initially. This manifested itself in repeated deep
scrub failures. Correct me if I'm wrong here, but Ceph keeps checksums of
data being written and in case that data is read back corrupted on one of
Hi,
How does Ceph detect and manage disk failures? What happens if some data is
written to a bad sector?
Is there any chance that the bad sector gets "distributed" across the cluster
due to the replication?
Is Ceph able to remove the OSD bound to the failed disk automatically?