Never mind, I should’ve read the whole thread first.

> On Nov 2, 2017, at 10:50 AM, Hans van den Bogert <hansbog...@gmail.com> wrote:
>
>> On Nov 1, 2017, at 4:45 PM, David Turner <drakonst...@gmail.com> wrote:
>>
>> All it takes for data loss is that an osd on server 1 is marked down and a
>> write happens to an osd on server 2. Now the osd on server 2 goes down
>> before the osd on server 1 has finished backfilling and the first osd
>> receives a request to modify data in the object that it doesn't know the
>> current state of. Tada, you have data loss.
>
> I’m probably misunderstanding, but if an osd on server 1 is backfilling, and
> its only candidate to backfill from is an osd on server 2, and the latter
> goes down; then wouldn’t the osd on server 1 block, i.e., not accept requests
> to modify, until server 1 comes up again?
> Or is there a ‘hole’ here somewhere where server 1 *thinks* it’s done
> backfilling whereas the osdmap it used to backfill with was out of date?
>
> Thanks,
>
> Hans
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com