Re: Shard Splitting API Proposal

Jan Lehnardt Sat, 16 Feb 2019 23:32:48 -0800

Heya Nick,

Nicely done. I think even though the majority of the discussion had already 
happened here, the RFC nicely pulled together the various discussion threads 
into a coherent whole.


I would imagine the discussion on GH would be similarly fruitful.

I gave it my +1, and as I said at the outset: I'm very excited about this 
feature!

Best
Jan
—

> On 15. Feb 2019, at 23:45, Nick Vatamaniuc <vatam...@gmail.com> wrote:
> 
> Decided to kick the tires on the new RFC proposal issue type and created
> one for shard splitting:
> 
> https://github.com/apache/couchdb/issues/1920
> 
> Let's see how it goes. Since it's the first one, let me know if I missed
> anything obvious.
> 
> Also I'd like to thank everyone who contributed to the discussion. The API
> is looking more solid and is much improved from where it started.
> 
> Cheers,
> -Nick
> 
> 
> 
>> On Wed, Feb 13, 2019 at 12:03 PM Nick Vatamaniuc <vatam...@gmail.com> wrote:
>> 
>> 
>> 
>>> On Wed, Feb 13, 2019 at 11:52 AM Jan Lehnardt <j...@apache.org> wrote:
>>> 
>>> 
>>> 
>>>> On 13. Feb 2019, at 17:12, Nick Vatamaniuc <vatam...@gmail.com> wrote:
>>>> 
>>>> Hi Jan,
>>>> 
>>>> Thanks for taking a look!
>>>> 
>>>>> On Wed, Feb 13, 2019 at 6:28 AM Jan Lehnardt <j...@apache.org> wrote:
>>>>> 
>>>>> Nick, this is great. I have a few tiny nits left; apologies, I only now
>>>>> got to it.
>>>>> 
>>>>>> On 12. Feb 2019, at 18:08, Nick Vatamaniuc <vatam...@gmail.com> wrote:
>>>>>> 
>>>>>> Shard Splitting API Proposal
>>>>>> 
>>>>>> I'd like to thank everyone who contributed to the API discussion. As a
>>>>>> result we have a much better and more consistent API than what we
>>>>>> started with.
>>>>>> 
>>>>>> Before continuing I wanted to summarize to see what we ended up with.
>>>>>> The main changes since the initial proposal were switching to using
>>>>>> /_reshard as the main endpoint and having a detailed state transition
>>>>>> history for jobs.
>>>>>> 
>>>>>> * GET /_reshard
>>>>>> 
>>>>>> Top level summary. Besides the new _reshard endpoint, there is now a
>>>>>> `reason` and the stats are more detailed.
>>>>>> 
>>>>>> Returns
>>>>>> 
>>>>>> {
>>>>>>  "completed": 3,
>>>>>>  "failed": 4,
>>>>>>  "running": 0,
>>>>>>  "state": "stopped",
>>>>>>  "state_reason": "Manual rebalancing",
>>>>>>  "stopped": 0,
>>>>>>  "total": 7
>>>>>> }
>>>>>> 
>>>>>> * PUT /_reshard/state
>>>>>> 
>>>>>> Start or stop global rebalancing.
>>>>>> 
>>>>>> Body
>>>>>> 
>>>>>> {
>>>>>>  "state": "stopped",
>>>>>>  "reason": "Manual rebalancing"
>>>>>> }
>>>>>> 
>>>>>> Returns
>>>>>> 
>>>>>> {
>>>>>>  "ok": true
>>>>>> }
>>>>>> 
>>>>>> * GET /_reshard/state
>>>>>> 
>>>>>> Return global resharding state and reason.
>>>>>> 
>>>>>> {
>>>>>>  "reason": "Manual rebalancing",
>>>>>>  "state": "stopped"
>>>>>> }
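
A minimal Python sketch of the PUT /_reshard/state call described above. It
only builds the request and does not send it. The base URL (CouchDB's default
port 5984 on localhost), the helper name `build_state_request`, and the
accepted values "running"/"stopped" are assumptions for illustration; the
proposal does not fix them here, and authentication is left out.

```python
import json
import urllib.request


def build_state_request(state, reason, base_url="http://127.0.0.1:5984"):
    """Build (but do not send) a PUT /_reshard/state request.

    `base_url` and the set of accepted states are assumptions of this sketch.
    """
    if state not in ("running", "stopped"):
        raise ValueError(f"unsupported state: {state!r}")
    # Body mirrors the proposal: a "state" plus a free-form "reason".
    body = json.dumps({"state": state, "reason": reason}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_reshard/state",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = build_state_request("stopped", "Manual rebalancing")
```

Passing `req` to `urllib.request.urlopen` would send it to a running node.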
>>>>> 
>>>>> More a note than a change request, but `state` is a very generic term
>>>>> that often confuses folks when they are new to something. If the set of
>>>>> possible states is `started` and `stopped`, how about making this
>>>>> endpoint a boolean?
>>>>> 
>>>>> /_reshard/enabled
>>>>> 
>>>>> {
>>>>>  "enabled": true|false,
>>>>>  "reason": "Manual rebalancing"
>>>>> }
>>>>> 
>>>>> 
>>>> I thought of that as well. However _reshard/state seemed consistent with
>>>> _reshard/jobs/$jobid/state. Setting "state":"stopped" on _reshard/state
>>>> will lead to the state of all individual running jobs becoming "stopped"
>>>> as well. And "running" will make jobs that are not individually stopped
>>>> also become "running". In other words, since it directly toggles the
>>>> jobs' state (with a job being able to override the stopped state) I like
>>>> that it had the same arguments
>>> 
>>> Got it, makes perfect sense.
>>> 
>>>> and": true|false
>>>> 
>>>> 
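
The interaction between the global state and per-job states described above
can be modelled in a few lines of Python. This is a sketch of the described
semantics only, not CouchDB code: the function name `effective_job_state` is
hypothetical, and treating "completed" and "failed" jobs as unaffected by the
global toggle is an assumption.

```python
def effective_job_state(global_state, job_state):
    """Return the state a job ends up in under the global toggle.

    global_state: "running" or "stopped", as set via /_reshard/state.
    job_state: the state set on the job itself via
               /_reshard/jobs/$jobid/state, or a finished state.
    """
    if job_state in ("completed", "failed"):
        # Assumption: finished jobs are not touched by the global toggle.
        return job_state
    if global_state == "stopped":
        # A global stop stops every running job.
        return "stopped"
    # Global "running": jobs that were stopped individually stay stopped,
    # all other jobs run.
    return job_state


print(effective_job_state("stopped", "running"))
print(effective_job_state("running", "stopped"))
```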
>>>>> 
>>>>>> * GET /_reshard/jobs
>>>>>> 
>>>>>> Get the state of all the resharding jobs on the cluster. Now we have a
>>>>>> detailed state transition history which looks similar to what
>>>>>> _scheduler/jobs has.
>>>>>> 
>>>>>> {
>>>>>>  "jobs": [
>>>>>>      {
>>>>>>          "history": [
>>>>>>              {
>>>>>>                  "detail": null,
>>>>>>                  "timestamp": "2019-02-06T22:28:06Z",
>>>>>>                  "type": "new"
>>>>>>              },
>>>>>>              ...
>>>>>>              {
>>>>>>                  "detail": null,
>>>>>>                  "timestamp": "2019-02-06T22:28:10Z",
>>>>>>                  "type": "completed"
>>>>>>              }
>>>>>>          ],
>>>>>>          "id": "001-0a308ef9f7bd24bd4887d6e619682a6d3bb3d0fd94625866c5216ec1167b4e23",
>>>>>>          "job_state": "completed",
>>>>>>          "node": "node1@127.0.0.1",
>>>>>>          "source": "shards/00000000-ffffffff/db1.1549492084",
>>>>>>          "split_state": "completed",
>>>>>>          "start_time": "2019-02-06T22:28:06Z",
>>>>>>          "state_info": {},
>>>>>>          "targets": [
>>>>>>              "shards/00000000-7fffffff/db1.1549492084",
>>>>>>              "shards/80000000-ffffffff/db1.1549492084"
>>>>>>          ],
>>>>> 
>>>>> Since we went from /_split to /_reshard to prepare for merging shards,
>>>>> we should reconsider source (singular) and targets (plural). Either a
>>>>> merge job (in the future) uses sources (plural) and target (singular)
>>>>> and the job schema is intentionally different, or we unify things to,
>>>>> maybe singular, source/target, which would have the nice property of
>>>>> being analogous to our replication job schema. The type definition then
>>>>> is source:String and target:Array(2) for split jobs and source:Array(2)
>>>>> target:String for (future) merge jobs.
>>>>> 
>>>>> 
>>>> Joan suggested adding a "type" field to both job creation POST body and
>>>> also returning it when we inspect the job(s) state. So the "type":"split"
>>>> would toggle the schema. It could be "merge" in the future, or even
>>>> something like "rebalance" where it would merge some and split others
>>>> perhaps :-) and since we have a type it would be easier to differentiate
>>>> between the merge and split jobs. But if there is a consensus from
>>>> others about switching targets to target that's easy as well.
>>> 
>>> Ah, I’m less concerned here about not being able to tell whether it’s a
>>> split or a merge, and more about having a plural form (sourceS/targetS)
>>> that changes depending on the type. It’s just an easy thing to get wrong.
>>> 
>>> In addition, we already have source/target in CouchDB replication,
>>> which people already use successfully, so making a similar thing that
>>> behaves slightly differently doesn’t sit quite right with me.
>>> 
>>> I understand that I’m arguing to remove an ’s’ for very nitpicky
>>> reasons, but these are the kind of nitpick discussions we’ve done a lot
>>> in the early days which resulted in a by and large decent API that
>>> has served us well, and it’s something I’d like to see taken forward.
>>> Apologies if this all sounds very strict ;)
>>> 
>>> 
>> Thanks for the longer explanation. I understand now and agree, let's make
>> it target. No worries about sounding nitpicky, we should be nitpicky about
>> APIs!
>> 
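
To make the agreed shape concrete, here is a small Python sketch of a check
for the singular source/target schema combined with the "type" field Joan
suggested. The function name `validate_job` is hypothetical, the "merge"
branch describes a possible future job type rather than anything that exists,
and the fixed length of 2 reflects the 1:2 ratio discussed in this thread.

```python
def validate_job(job):
    """Check the source/target shapes of a resharding job dict.

    split: source is a string, target is a list of 2 strings.
    merge: source is a list of 2 strings, target is a string (future type).
    """
    kind = job.get("type")
    source, target = job.get("source"), job.get("target")
    if kind == "split":
        ok = isinstance(source, str) and isinstance(target, list) and len(target) == 2
    elif kind == "merge":
        ok = isinstance(source, list) and len(source) == 2 and isinstance(target, str)
    else:
        raise ValueError(f"unknown job type: {kind!r}")
    if not ok:
        raise ValueError(f"source/target do not match the {kind!r} schema")
    return job


split_job = validate_job({
    "type": "split",
    "source": "shards/00000000-ffffffff/db1.1549492084",
    "target": [
        "shards/00000000-7fffffff/db1.1549492084",
        "shards/80000000-ffffffff/db1.1549492084",
    ],
})
```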
>> 
>>>> 
>>>>> 
>>>>> And just another question, sorry if I missed this elsewhere, would we
>>>>> ever consider adding a split/merge ratio different from 1:2, say 1:4, or
>>>>> will folks have to run 1:2, 1:2, 1:2 to get to the same result? I’m fine
>>>>> with either and if 1:2 fixed makes things simpler, I’m all for it ;)
>>>>> 
>>>>> 
>>>> Good point. Actually it's already implemented that way :-) Right
>>>> below the API surface it has a split=2 parameter and it just creates the
>>>> targets based on that. It could be 2, 3, 4, ... 10 etc. However I was
>>>> thinking of keeping it hard coded at 2 at first to keep the behavior
>>>> simpler and open that parameter to be user facing in a later
>>>> release based on user feedback.
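
As a rough illustration of how a split=N parameter could derive target
ranges, the Python sketch below divides an inclusive hex range such as
"00000000-ffffffff" into N contiguous parts. It is not the actual CouchDB
implementation: the function name `split_range` is hypothetical, and the way
an uneven remainder is spread across parts is an assumption of this sketch.

```python
def split_range(shard_range, split=2):
    """Split an inclusive hex range "begin-end" into `split` contiguous parts."""
    if split < 2:
        raise ValueError("split must be at least 2")
    begin_hex, end_hex = shard_range.split("-")
    begin, end = int(begin_hex, 16), int(end_hex, 16)
    size = end - begin + 1
    if size < split:
        raise ValueError("range is too small to split that many ways")
    parts = []
    for i in range(split):
        # Integer boundaries; any remainder is spread over the parts.
        lo = begin + (i * size) // split
        hi = begin + ((i + 1) * size) // split - 1
        parts.append(f"{lo:08x}-{hi:08x}")
    return parts


halves = split_range("00000000-ffffffff")
quarters = split_range("00000000-ffffffff", split=4)
# Running 1:2 on each half covers the same ranges as a single 1:4 split.
two_rounds = [part for half in halves for part in split_range(half)]
```

With split=2 this yields the two target ranges shown in the job example
earlier in the thread.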
>>> 
>>> Ace, again, fully on board with shipping 1:2 first and maybe offering
>>> other options later.
>>> 
>>> Best
>>> Jan
>>> —
>>> 
>>>> 
>>>> Cheers,
>>>> 
>>>> -Nick
>>> 
>>>
