> On Mar 24, 2020, at 05:51, Garren Smith <gar...@apache.org> wrote:
> On Tue, Mar 24, 2020 at 1:30 AM Joan Touzet <woh...@apache.org 
> <mailto:woh...@apache.org>> wrote:
> 
>> Question: Imagine a node that's been offline for a bit and is just
>> coming back on. (I'm not 100% sure how this works in FDB land.) If
>> there's a (stale) index on disk, and the index is being updated, and the
>> index on disk is kind of stale...what happens?
>> 
> 
> With couchdb_layer this can't happen as each CouchDB node is stateless and
> doesn't actually keep any indexes. Everything would be in FoundationDB. So
> if the index is built then it is built and ready for all couch_layer nodes.
> 
> FoundationDB storage servers could fall behind the Tlogs. I'm not 100% sure
> what would happen in this case. But it would be consistent for all
> couch_layer nodes.

When a client gets a read version to begin a transaction in FDB, it is promised 
that this was the most recent committed version at some point in time between 
issuing the request and receiving the reply.  When it issues reads, those reads 
must include that version, and must get back the most recently written value 
for that key as of the included version.  FDB is not allowed to break this 
contract during faults.
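The contract can be illustrated with a toy multi-version store (a plain-Python sketch, not the real FDB client API; all names here are invented for illustration):

```python
class VersionedStore:
    """Toy multi-version store: each key keeps its full write history."""

    def __init__(self):
        self.history = {}   # key -> list of (version, value), version-ordered
        self.version = 0    # most recently committed version

    def commit(self, key, value):
        # Each commit happens at a new, strictly increasing version.
        self.version += 1
        self.history.setdefault(key, []).append((self.version, value))
        return self.version

    def get_read_version(self):
        # The version a transaction reads at: the most recent committed
        # version at some moment during this call.
        return self.version

    def read(self, key, read_version):
        # Return the most recently written value for `key` as of
        # `read_version`, ignoring anything committed later.
        for version, value in reversed(self.history.get(key, [])):
            if version <= read_version:
                return value
        return None

store = VersionedStore()
store.commit("k", "v1")        # committed at version 1
rv = store.get_read_version()  # transaction begins: read version 1
store.commit("k", "v2")        # a later write, at version 2
print(store.read("k", rv))     # prints "v1": the snapshot as of rv
```

Later writes are invisible to a transaction holding an older read version, which is why every replica serving a read must have applied mutations at least up to that version.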

The cluster will continue advancing in versions, as it does not throttle when 
only one server in a shard falls behind (or is offline).  When the server comes 
back online, it will pull the stream of mutations from the transaction logs to 
catch up.  In the meantime, it remains unavailable for reads, as clients send 
read requests for a specific (recent) version that the lagging storage server 
knows it does not yet have.  After waiting 1s, the storage server replies with 
a `future_version` error to tell the client it won’t be getting an answer soon.  
The client then decides, based on either the error or the observed latency, to 
re-issue the read to a different replica of that shard so that it may get an 
answer, and will continue doing so until it notices that the lagged storage 
server has caught up and is responding successfully.
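The failover behavior above can be sketched as follows (a simulation with invented names, not the real client internals — `FutureVersion`, `Replica`, and `read_with_failover` are all hypothetical):

```python
class FutureVersion(Exception):
    """Raised when a storage server has not yet applied mutations up to
    the requested read version (FDB returns this after ~1s of waiting)."""

class Replica:
    def __init__(self, name, applied_version, data):
        self.name = name
        self.applied_version = applied_version  # last version applied from the tlogs
        self.data = data

    def read(self, key, read_version):
        if read_version > self.applied_version:
            # This replica is lagging; it cannot serve this version yet.
            raise FutureVersion(self.name)
        return self.data.get(key)

def read_with_failover(replicas, key, read_version):
    """Try each replica of the shard in turn until one can serve the version."""
    for replica in replicas:
        try:
            return replica.read(key, read_version)
        except FutureVersion:
            continue  # lagging replica; re-issue to the next copy of the shard
    raise RuntimeError("no replica caught up to version %d" % read_version)

lagging = Replica("ss1", applied_version=90, data={"k": "stale"})
healthy = Replica("ss2", applied_version=100, data={"k": "fresh"})
print(read_with_failover([lagging, healthy], "k", read_version=100))  # prints "fresh"
```

Once `ss1` has replayed the tlog stream past version 100, it would start answering again and the client would resume using it.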

If you’re interested in more details around the operational side of a storage 
server failure, I’d suggest reading the threads that Kyle Snavely started on 
the FDB Forums:
https://forums.foundationdb.org/t/quick-question-on-tlog-disk-space-for-large-clusters/1962
https://forums.foundationdb.org/t/questions-regarding-maintenance-for-multiple-storage-oriented-machines-in-a-data-hall/2010
