Re: Shard Splitting API Proposal

Jan Lehnardt Wed, 13 Feb 2019 08:52:42 -0800

> On 13. Feb 2019, at 17:12, Nick Vatamaniuc <[email protected]> wrote:
> 
> Hi Jan,
> 
> Thanks for taking a look!
> 
> On Wed, Feb 13, 2019 at 6:28 AM Jan Lehnardt <[email protected]> wrote:
> 
>> Nick, this is great, I have a few tiny nits left, apologies I only now got
>> to it.
>> 
>>> On 12. Feb 2019, at 18:08, Nick Vatamaniuc <[email protected]> wrote:
>>> 
>>> Shard Splitting API Proposal
>>> 
>>> I'd like thank everyone who contributed to the API discussion. As a
>> result
>>> we have a much better and consistent API that what we started with.
>>> 
>>> Before continuing I wanted to summarize to see what we ended up with. The
>>> main changes since the initial proposal were switching to using /_reshard
>>> as the main endpoint and having a detailed state transition history for
>>> jobs.
>>> 
>>> * GET /_reshard
>>> 
>>> Top level summary. Besides the new _reshard endpoint, there `reason` and
>>> the stats are more detailed.
>>> 
>>> Returns
>>> 
>>> {
>>>   "completed": 3,
>>>   "failed": 4,
>>>   "running": 0,
>>>   "state": "stopped",
>>>   "state_reason": "Manual rebalancing",
>>>   "stopped": 0,
>>>   "total": 7
>>> }
>>> 
>>> * PUT /_reshard/state
>>> 
>>> Start or stop global rebalacing.
>>> 
>>> Body
>>> 
>>> {
>>>   "state": "stopped",
>>>   "reason": "Manual rebalancing"
>>> }
>>> 
>>> Returns
>>> 
>>> {
>>>   "ok": true
>>> }
>>> 
>>> * GET /_reshard/state
>>> 
>>> Return global resharding state and reason.
>>> 
>>> {
>>>   "reason": "Manual rebalancing",
>>>   "state": “stopped”
>>> }
>> 
>> More a note than a change request, but `state` is a very generic term that
>> often confuses folks when they are new to something. If the set of possible
>> states is `started` and `stopped`, how about making this endpoint a boolean?
>> 
>> /_reshard/enabled
>> 
>> {
>>   "enabled": true|false,
>>   "reason": "Manual rebalancing"
>> }
>> 
>> 
> I thought of that as well. However _reshard/state seemed consistent with
> _reshard/jobs/$jobid/state. Setting "state":"stopped" _reshard/state will
> lead to all individual running job state to become "stopped" as well. And
> "running" will make jobs that are not individually stopped also become
> "running". In other words since it directly toggle job's state (with a job
> being to override stopped state) I like that it had the same arguments

Got it, makes perfect sense.

> and": true|false
> 
> 
>> 
>>> * GET /_reshard/jobs
>>> 
>>> Get the state of all the resharding jobs on the cluster. Now we have a
>>> detailed
>>> state transition history which looks similar what _scheduler/jobs have.
>>> 
>>> {
>>>   "jobs": [
>>>       {
>>>           "history": [
>>>               {
>>>                   "detail": null,
>>>                   "timestamp": "2019-02-06T22:28:06Z",
>>>                   "type": "new"
>>>               },
>>>               ...
>>>               {
>>>                   "detail": null,
>>>                   "timestamp": "2019-02-06T22:28:10Z",
>>>                   "type": "completed"
>>>               }
>>>           ],
>>>           "id":
>>> "001-0a308ef9f7bd24bd4887d6e619682a6d3bb3d0fd94625866c5216ec1167b4e23",
>>>           "job_state": "completed",
>>>           "node": "[email protected]",
>>>           "source": "shards/00000000-ffffffff/db1.1549492084",
>>>           "split_state": "completed",
>>>           "start_time": "2019-02-06T22:28:06Z",
>>>           "state_info": {},
>>>           "targets": [
>>>               "shards/00000000-7fffffff/db1.1549492084",
>>>               "shards/80000000-ffffffff/db1.1549492084"
>>>           ],
>> 
>> Since we went from /_split to /_reshard to prepare for merging shards, we
>> should reconsider source (singular) and targets (plural). Either a merge
>> job (in the future) uses sources (plural) and target (singular) and the job
>> schema is intentionally different, or we unify things to, maybe singular:
>> source/target which would have the nice property of being analogous to our
>> replication job schema. The type definition then is source:String and
>> target:Array(2) for split jobs and source:Array(2) target:String for
>> (future) merge jobs.
>> 
>> 
> Joan suggested adding a "type" field to both job creation POST body and
> also returning it when we inspect the job(s) state. So the "type":"split"
> would toggle the schema. It could be "merge" in the future, or even
> something like "rebalance" where it would merge some and split others
> perhaps :-) and since we have a type it would be easier to differentiate
> between the merge and split jobs. But if there is a consensus from others
> about switching targets to target that's easily as well.

Ah, I’m less concerned here about not being able to tell whether it’s a
split or a merge, and more about that having an indiscriminate plural
form (sourceS/targetS) depending on the type. It’s just an easy thing to
get wrong.

In addition, we already have source/target in CouchDB replication,
which people already use successfully, so making a similar thing that
behaves slightly differently doesn’t sit quite right with me.

I understand that I’m arguing to remove an ’s’ for very nitpicky
but these are the kind of nitpick discussions we’ve done a lot in
the early days which resulted in a by and large decent API that
has served as well, and it’s something I’d like to see taken forward.
Apologies if this all sounds very strict ;)

> 
>> 
>> And just another question, sorry if I missed this elsewhere, would we ever
>> consider adding to split/merge ratio different from 1:2, say 1:4, or will
>> folks have to run 1:2, 1:2, 1:2 to get to the same result? I’m fine with
>> either and if 1:2 fixed makes things simpler, I’m all for it ;)
>> 
>> 
> Good point. Actually it's already implemented that way already :-) Right
> below the API surface it has a split=2 parameter and it just creates the
> targets based on that. It could be 2, 3, 4, ... 10 etc. However I was
> thinking of keeping it hard coded at 2 at first to keep the behavior
> simpler at first and open that parameter to be user facing in a later
> release based on user feedback.

Ace, again, fully on board with shipping 1:2 first and maybe offering other
options later.

Best
Jan
—

> 
> Cheers,
> 
> -Nick
Re: Shard Splitting API Proposal

Reply via email to