Hi Greg,

Accepting that an osd with outdated data can never accept writes, or io of any kind, how is it possible for the system to get into this state?

-All osds are Bluestore, checksum, mtime etc.

-All osds are up and in

-No hw failures, lost disks, damaged journals or databases etc.

-The data became inconsistent


Thanks,

Denke.


On 11/02/2017 11:51 PM, Gregory Farnum wrote:

On Thu, Nov 2, 2017 at 1:21 AM koukou73gr <koukou7...@yahoo.com> wrote:

    The scenario is actually a bit different, see:

    Let's assume size=2, min_size=1
    -We are looking at pg "A" acting [1, 2]
    -osd 1 goes down
    -osd 2 accepts a write for pg "A"
    -osd 2 goes down
    -osd 1 comes back up, while osd 2 still down
    -osd 1 has no way to know osd 2 accepted a write in pg "A"
    -osd 1 accepts a new write to pg "A"
    -osd 2 comes back up.

    bang! osd 1 and 2 now have different views of pg "A" but both claim to
    have current data.


In this case, OSD 1 will not accept IO precisely because it cannot prove it has the current data. That is the basic purpose of OSD peering, and it holds in all cases.
-Greg
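
As an illustration of the rule above, here is a minimal Python sketch of the check that peering performs in koukou73gr's scenario. It is a simplified model, not Ceph's actual code: the names Interval, maybe_went_rw and can_go_active are hypothetical, and real peering tracks far more state (epochs, PG logs, last_update and so on). The only idea it models is that a PG may serve IO only if, for every past interval in which writes could have been accepted, at least one OSD from that interval is reachable now.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A period during which the set of up OSDs for the PG did not change."""
    up: frozenset  # ids of the OSDs that were up and acting in this interval

    def maybe_went_rw(self, min_size: int) -> bool:
        # Writes could only have been accepted if enough replicas were up.
        return len(self.up) >= min_size


def can_go_active(past, up_now, min_size):
    """Return True if the OSDs in `up_now` may serve IO for the PG.

    Simplified assumption: every past interval that may have accepted
    writes must share at least one OSD with the currently up set,
    otherwise nobody reachable can prove what the latest data is.
    """
    if len(up_now) < min_size:
        return False
    for interval in past:
        if interval.maybe_went_rw(min_size) and not (interval.up & up_now):
            return False
    return True


# History from the scenario: both up, then only osd 2, then nobody.
past = [Interval(frozenset({1, 2})), Interval(frozenset({2})), Interval(frozenset())]

osd1_alone = can_go_active(past, frozenset({1}), min_size=1)
both_back = can_go_active(past, frozenset({1, 2}), min_size=1)
print(osd1_alone, both_back)
```

With size=2, min_size=1, osd 1 alone is refused because the interval in which only osd 2 was up may have taken writes that osd 1 cannot see. Once osd 2 returns, the check passes and osd 1 can be brought up to date.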



    -K.

    On 2017-11-01 20:27, Denes Dolhay wrote:
    > Hello,
    >
    > I have a trick question for Mr. Turner's scenario:
    > Let's assume size=2, min_size=1
    > -We are looking at pg "A" acting [1, 2]
    > -osd 1 goes down, OK
    > -osd 1 comes back up, backfill of pg "A" commences from osd 2 to osd 1, OK
    > -osd 2 goes down (and therefore pg "A"'s backfill to osd 1 is
    > incomplete and stopped) not OK, but this is the case...
    > --> In this event, why does osd 1 accept IO to pg "A" knowing full well
    > that its data is outdated and will cause an inconsistent state?
    > Wouldn't it be prudent to deny io to pg "A" until either
    > -osd 2 comes back (therefore we have a clean osd in the acting group)
    > ... backfill would continue to osd 1 of course
    > -or data in pg "A" is manually marked as lost, and then continues
    > operation from osd 1's (outdated) copy?
    _______________________________________________
    ceph-users mailing list
    ceph-users@lists.ceph.com
    http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


