Re: [DISCUSS] Per-doc access control

Jan Lehnardt Sun, 07 Jul 2019 05:01:29 -0700

Hi all,

I have a first RFC up here: 
https://github.com/apache/couchdb-documentation/pull/424


The goal of this is to have a concrete scope of changes written
down and have it be a minimal set to implement while being useful.

Specifically, this proposal defers to later:

* per-access views
* differentiation between read and write access for documents
* sharing individual documents between multiple users or groups

The feature design so far aims to allow the addition of the above
at a later point. Specifically, per-access views and read/write
differentiation should not be too hard to add, but doc sharing
might better be left for a FoundationDB future.

I’d like a thorough review and comments that specifically address
any holes in the proposal. In particular, I’m interested to hear
if there are any showstoppers in there that a common db-per-user
setup could not migrate over to this model.

Best
Jan
—


> On 4. Apr 2019, at 10:45, Jan Lehnardt <j...@apache.org> wrote:
> 
> Thanks for your initial comments.
> 
>> On 3. Apr 2019, at 23:07, Adam Kocoloski <kocol...@apache.org> wrote:
>> 
>> I’m also in favor of dropping Scenario 3.
>> 
>> One topic we may have discussed in the past but I wanted to close out here: 
>> in the relational database world it’s not uncommon to use materialized views 
>> as an access control mechanism to selectively expose contents of a table to 
>> clients who cannot access the table directly. Does the current thinking on 
>> _access for views support that use case? Can we build a view using a set of 
>> roles inherited from the user who created the design doc, but then turn 
>> around and set the _access on the view itself to a less-restrictive set?
> 
> 3 minutes thinking it over didn’t reveal any particular problems with this 
> feature, aside from include_docs not working as expected, which might be an 
> okay trade-off for now. But could be included later.
> 
> 
>> On the _revs_diff topic — I’m not all that concerned about users trying to 
>> guess revision IDs that exist on the server, and then reverse-engineer the 
>> contents of the existing revisions. Maybe I ought to be.
> 
> I’m not particularly worried, but it is at least a theoretical situation 
> where our users can be caught with their pants down when they didn’t expect 
> it. All I want is to make sure we document this properly. Cf. git, where if 
> you get access to a repo, you get the whole history, not just the state from 
> where you started having access.
> 
> Best
> Jan
> —
> 
>> 
>> On a somewhat-related note, I have had conversations before with folks who 
>> are keen to adopt these sorts of fine-grained access control systems who 
>> said they actually prefer to have a 403 Forbidden response list the set of 
>> privileges that would be sufficient to access the resource. I found this 
>> surprising, but I guess it comes down to a user needing to figure out what 
>> kind of security exception to apply for in order to make progress with some 
>> data analysis. I think this is a topic on which we could make a fairly 
>> late-binding decision — or even have it as a configurable option.
>> 
>> I could definitely see the base Scenario 1 (single _access labels) landing 
>> ahead of the more-complex sharing models.
>> 
>> I haven’t had a chance to take a deep look at the code but the design seems 
>> good and thoughtful, and I definitely like the focus on the use cases.
>> 
>> Adam
>> 
>>> On Mar 14, 2019, at 11:21 AM, Jan Lehnardt <j...@apache.org> wrote:
>>> 
>>> My replies now inline.
>>> 
>>>> On 14. Mar 2019, at 16:13, Jan Lehnardt <j...@apache.org> wrote:
>>>> 
>>>> I received some notes privately from Gregor Martynus, which I’m 
>>>> reproducing here in email thread form. This email is all Gregor’s notes, 
>>>> my next email is my replies to them.
>>>> 
>>>>> On 10. Mar 2019, at 15:51, Jan Lehnardt <j...@apache.org> wrote:
>>>>> 
>>>>> Hey all,
>>>>> 
>>>>> after mulling this over some more, I’d like to tackle the detailed API 
>>>>> and behaviour for this. Especially how _access works in conjunction with 
>>>>> existing access control features.
>>>>> 
>>>>> My guiding principles so far are:
>>>>> 
>>>>> 1. Make the API intuitive: things should work the way they look like they 
>>>>> should work.
>>>>> 2. The default should never be that a resource is accidentally left 
>>>>> accessible to the public.
>>>>> 3. This should work as a natural extension to the existing security 
>>>>> features*.
>>>>> 
>>>>> * I’d be up for reworking the whole lot, too, but that might be a better 
>>>>> discussion for > 4.0.
>>>>> 
>>>>> 
>>>>> ## Database Creation and Default Behaviours
>>>>> 
>>>>> Creating a database with _access features is, as mentioned before, done 
>>>>> via a flag: PUT /database?access=true
>>>>> 
>>>>> In a 3.0 world where this would land, we already agreed that databases 
>>>>> should be admin-only by default (instead of world-readable and -writeable 
>>>>> as they are today). This is a sensible default, but that leaves us with an 
>>>>> _access enabled database that can’t be used by anyone but server or db 
>>>>> admins. Not very useful.
>>>>> 
>>>>> To allow arbitrary users to use the db, I suggest we use the existing 
>>>>> _security system: i.e. if a user or a group a user belongs to is 
>>>>> mentioned in either `admins` or `members` inside of _security, they can 
>>>>> proceed and create documents on the db. This puts a second step burden on 
>>>>> the application developer, but it slots cleanly into the existing 
>>>>> security mechanisms, and doesn’t require special case handling. 
>>>>> Alternatively, we could define that _security isn’t available in _access 
>>>>> enabled databases, but that’s something I’d like to avoid if at all 
>>>>> possible.
>>>>> 
>>>>> In order to make it easy to specify that “everyone in _users” should be 
>>>>> able to use the db, I suggest we add a new role `_users` that is valid 
>>>>> inside _security, which means “everyone in /_users” (this only excludes 
>>>>> server admins which have full access anyway).
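The membership check described above could look roughly like this. It is a sketch under assumptions: `may_use_db` and the `in_users_db` flag are hypothetical names, the `_security` layout (`admins`/`members`, each with `names` and `roles`) follows the existing CouchDB object, and the `_users` role is the one proposed in this mail, not an existing feature.

```python
def may_use_db(security, user_ctx, in_users_db=True):
    """Return True if the user may create docs in the database.

    Hypothetical helper modelling the proposal, not CouchDB code.
    """
    roles = set(user_ctx.get("roles", []))
    if "_admin" in roles:
        # Server admins have full access anyway.
        return True
    if in_users_db:
        # Proposed pseudo-role: everyone who has a doc in /_users.
        roles.add("_users")
    for section in ("admins", "members"):
        entry = security.get(section, {})
        if user_ctx["name"] in entry.get("names", []):
            return True
        if roles & set(entry.get("roles", [])):
            return True
    return False
```

With `{"members": {"roles": ["_users"]}}` in _security, every regular user passes the check, while a database with an empty _security stays admin-only.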
>>>>> 
>>>>> * * *
>>>>> 
>>>>> 
>>>>> ## Document Creation and Access Control
>>>>> 
>>>>> Next, one of our non-admin users creates a doc. There are multiple 
>>>>> options as to how we store the _access information.
>>>>> 
>>>>> 1. Automatically translate the userCtx.name of a doc creation (not an 
>>>>> update) into the first element of the _access array. E.g. user_a PUT 
>>>>> /db/doc {"a":1} creates this doc: {"a":1,"_access":["user_a"]}. This is a 
>>>>> little bit counter-intuitive.
>>>>> 
>>>>> 2. We require that a user puts "_access":["user_a"] in themselves. This 
>>>>> is an explicit granting of access permissions on doc creation and I think 
>>>>> is preferable.
>>>> 
>>>> I prefer being explicit.
>>>> 
>>>> 
>>>>> 
>>>>> This leaves the edge case of docs that have no _access member: so far I 
>>>>> thought those docs are admin-only, with maybe a db-wide option to swap 
>>>>> the default to public access, but I think given the explicitness of 2. we 
>>>>> can do better: require _access for all new doc creations in 
>>>>> access-enabled databases. A user can not create a new document without an 
>>>>> _access field that is an array that has at least one member. For public 
>>>>> documents, we could invent a new role _public, and admin-only docs could 
>>>>> use the existing role _admin.
>>>>> 
>>>>> The one downside to this approach is that we won’t be able to replicate 
>>>>> existing databases into an access-enabled database without modifying all 
>>>>> documents. This might be a worthwhile trade-off, but we should make that 
>>>>> decision consciously and document it well.
>>>> 
>>>> We could also provide tooling for migrations?
>>> 
>>> I’d love tooling, but we’d have to make sure we can do it correctly for a 
>>> big number of use-cases. For the acceptance of this change, I’d make 
>>> “documenting a migration path for db-per-user setups” a MUST have, and any 
>>> code that helps with that a nice to have.
>>> 
>>>> 
>>>> 
>>>>> We could allow for a special case where an _admin user can create docs 
>>>>> that have no _access field, and those docs are treated as having only the 
>>>>> _admin role in _access. So at least we could replicate all data in, but 
>>>>> then require a manual step to update all docs to, say, migrate an existing 
>>>>> db-per-user app, while not accidentally exposing any docs to folks that 
>>>>> shouldn’t read them.
>>>>> 
>>>>> For the rest of cRUD, the existing document must store one of the RUD-ing 
>>>>> user’s name or role in its _access field.
>>>>> 
>>>>> For both creations and updates, a user MUST supply at least one role they 
>>>>> belong to or their own username.
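A minimal sketch of the write rule above, including the suggested special case for admins. The name `check_access_on_write` and the dict-shaped `user_ctx` are assumptions for illustration; exempting admins from the "own name or role" requirement is also an assumption, since the mail does not spell that out.

```python
def check_access_on_write(doc, user_ctx):
    """Return the effective _access list for a write, or raise ValueError.

    Hypothetical helper modelling the proposed rule, not CouchDB code.
    """
    roles = set(user_ctx.get("roles", []))
    is_admin = "_admin" in roles
    access = doc.get("_access")

    if access is None:
        # Proposed special case: an admin may omit _access, and the doc is
        # then treated as admin-only.
        if is_admin:
            return ["_admin"]
        raise ValueError("_access is required in access-enabled databases")

    if not isinstance(access, list) or not access:
        raise ValueError("_access must be an array with at least one member")
    if not all(isinstance(entry, str) for entry in access):
        raise ValueError("_access entries must be strings")

    # The writer must list their own name or one of their roles
    # (assumption: admins are exempt from this check).
    own = roles | {user_ctx["name"]}
    if not is_admin and not own.intersection(access):
        raise ValueError("_access must include the writer's name or role")
    return access
```

For example, user_a writing `{"a": 1, "_access": ["user_a"]}` passes, while the same user writing `{"a": 1}` is rejected.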
>>>>> 
>>>>> * * *
>>>>> 
>>>>> 
>>>>> ## _revs_diff
>>>>> 
>>>>> /db/_revs_diff can answer the question of which revisions of a document 
>>>>> do NOT exist on a replication target: 
>>>>> http://docs.couchdb.org/en/stable/api/database/misc.html#db-revs-diff
>>>>> 
>>>>> This would allow users to specify ids and rev(s) for docs they don’t have 
>>>>> access to (anymore), so the result schema should be expanded to handle 
>>>>> id: unauthorized or somesuch, something the replicator needs to know what 
>>>>> to do with if it encounters it (say a user got removed from the _access 
>>>>> list in between the replicator opening _changes and requesting the doc).
>>>>> 
>>>>> The _revs_diff implementation would have to be altered to send an 
>>>>> unauthorized token for each doc the requesting userCtx has no access to. 
>>>>> If we can re-use some of our existing indexes, or any other performance 
>>>>> optimisation, that’d be great. I haven’t looked at that code at all, yet.
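To make the extended result schema concrete, here is a rough in-memory sketch. Everything in it is hypothetical: the `revs_diff` and `can_access` helpers, the `docs` store layout, and the `{"error": "unauthorized"}` marker, which is just one possible spelling of "id: unauthorized or somesuch".

```python
def can_access(access, user_ctx):
    """True if the user's name or one of their roles is in the access list."""
    roles = set(user_ctx.get("roles", []))
    if "_admin" in roles:
        return True
    return bool((roles | {user_ctx["name"]}) & set(access))


def revs_diff(requested, docs, user_ctx):
    """requested: {doc_id: [rev, ...]}
    docs: {doc_id: {"access": [...], "revs": set_of_known_revs}}
    """
    result = {}
    for doc_id, revs in requested.items():
        doc = docs.get(doc_id)
        if doc is None:
            # Unknown doc: every requested revision is missing.
            result[doc_id] = {"missing": list(revs)}
        elif not can_access(doc["access"], user_ctx):
            # Assumed marker for docs the requesting user cannot access.
            result[doc_id] = {"error": "unauthorized"}
        else:
            missing = [rev for rev in revs if rev not in doc["revs"]]
            if missing:
                # Docs with nothing missing are left out in this sketch.
                result[doc_id] = {"missing": missing}
    return result
```

A replicator would then have to decide what to do with an `unauthorized` entry, for example skipping that doc rather than aborting the whole run.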
>>>>> 
>>>>> An important side-effect of this is, once a user has been added to a 
>>>>> doc’s _access list, they get access to “the full history of the doc”, 
>>>>> even before they had access. Of course, in CouchDB this means only 
>>>>> getting access to the rev ids, and not the content, but since they are 
>>>>> content-addressable hashes, a user could brute-force themselves into 
>>>>> revealing certain real values from earlier incarnations of the doc. I’d 
>>>>> rather not track _access per document revision in perpetuity, so this is 
>>>>> something we have to be very up-front about.
>>>>> 
>>>>> * * *
>>>>> 
>>>>> 
>>>>> ## Partitioned Databases
>>>>> 
>>>>> I mentioned partitioned databases in my previous mail, and I think it is 
>>>>> something we can document that end-users can opt into, but doesn’t 
>>>>> require any special casing on the _access proposal. That is, if users 
>>>>> start prefixing their doc ids with a user name or id and enable both 
>>>>> _access and partitions, then they get all the benefits of a partitioned 
>>>>> database, and if they choose not to, they don’t, but things keep working. 
>>>>> There are enough use-cases to warrant both behaviours.
>>>>> 
>>>>> * * *
>>>>> 
>>>>> 
>>>>> ## Scenarios that _access should help with.
>>>>> 
>>>>> Overall, we developed _access to allow users to stop using the 
>>>>> db-per-user architecture, but once we have per-doc-access control, folks 
>>>>> might start using this for all manner of things. We should be clear about 
>>>>> which scenarios we support and which we don’t.
>>>>> 
>>>>> 
>>>>> ### Scenario 1: db-per-user
>>>>> 
>>>>> In this scenario, before _access enabled databases, the only way to allow 
>>>>> mutually untrusting users to store data in a part of CouchDB that only 
>>>>> they (and admins) have access to was giving each user their own database.
>>>>> 
>>>>> In an _access enabled database, users can 
>>>>> CRUD/_changes/_all_docs/_revs_diff their own docs knowing no other user 
>>>>> (aside from admins) can access those docs.
>>>>> 
>>>>> This is the simplest scenario, as all we’d have to do is track the owner 
>>>>> of a document and produce by-access-id/seq indexes based on that owner.
>>>>> 
>>>>> The current prototype implementation mostly reflects this stage. Not 
>>>>> saying this is what we should ship, but it is the easiest to implement 
>>>>> and explain.
>>>>> 
>>>>> Aside, I might be able to be persuaded to ship this as a 2.x feature, to 
>>>>> help those folks who don’t need anything else.
>>>>> 
>>>>> 
>>>>> ### Scenario 2: db-per-user + Sharing
>>>> 
>>>> One scenario we should address is how un-sharing would work when 
>>>> documents are continuously replicated, e.g. to a client for offline usage. 
>>>> My understanding is that the person whose access to documents got 
>>>> revoked does not get a _changes update telling them that their access got 
>>>> removed; it would be up to the application developer to implement some 
>>>> kind of "notification" meta documents. Unless you have a better idea?
>>> 
>>> Since we now have a purge API as well, we could treat an un-share as a 
>>> purge for clients, and they can decide what to do with it.
>>> 
>>> Alternatively, we need to make breaking changes to the _changes feed; maybe we 
>>> can hide that behind an opt-in flag, like “/db/_changes?access=true”, and 
>>> then we can send new rows like:
>>> 
>>> {seq: XYZ, id: abc, rev:4-YYY, _revoked: true} or somesuch.
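As an illustration of how a client might consume such rows, here is a small sketch. The row shape follows the example above; `apply_change_rows` and the `local_docs` mapping (doc id to current rev) are hypothetical, and dropping the local copy is only one of the things a client could decide to do.

```python
def apply_change_rows(local_docs, rows):
    """Apply _changes rows to a local {doc_id: rev} mapping.

    Returns the last seq seen, or None if there were no rows.
    """
    last_seq = None
    for row in rows:
        if row.get("_revoked"):
            # Access was revoked: treat it like a purge of the local copy.
            local_docs.pop(row["id"], None)
        else:
            local_docs[row["id"]] = row["rev"]
        last_seq = row["seq"]
    return last_seq
```

A client that would rather keep a read-only copy could mark the doc as stale here instead of removing it.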
>>> 
>>> 
>>>> 
>>>>> 
>>>>> The second we allow per doc auth, users will want to share those docs 
>>>>> with other users. That’s why we initially suggested the _access field be 
>>>>> an array, so other users and groups can be specified to have access. 
>>>>> There are multiple scenarios in this one alone:
>>>>> 
>>>>> #### 2.1: The Todo List
>>>>> 
>>>>> In this scenario, a user has a reasonable amount of “personal data” that 
>>>>> they want to selectively share with one or more other users.
>>>>> 
>>>>> #### 2.2: The Chat/Forum/Newsgroup
>>>>> 
>>>>> In this scenario, a user wants to share any number of documents with a 
>>>>> reasonable number of groups. However, since we need to limit the number 
>>>>> of groups a user belongs to (currently 10, see below for details), this 
>>>>> might actually not be a great solution. Or folks couldn’t be in more than 
>>>>> 10 chat groups at a time.
>>>>> 
>>>>> #### 2.3: The Corporate Hierarchy
>>>>> 
>>>>> In this scenario, users want to share any number of docs with a 
>>>>> reasonable number of groups in a top-down/bottom-up fashion. Think CEO 
>>>>> shares with executives, execs share with divisions, divisions report up 
>>>>> to their one executive, etc.
>>>>> 
>>>>> 
>>>>> ### 3: Multiple Apps
>>>>> 
>>>>> The preceding scenarios all assume that a single application is 
>>>>> responsible for everything. However, once we allow mutually distrusting 
>>>>> users into a single database *and* make each per-user slice work (almost) 
>>>>> like a full standalone CouchDB database, what would stop users from using 
>>>>> this for a multi-homing feature, where different applications are used 
>>>>> for each user in the same database?
>>>>> 
>>>>> I’ll be referring to these scenarios down the line.
>>>>> 
>>>>> * * *
>>>>> 
>>>>> 
>>>>> ## Design Docs
>>>>> 
>>>>> ### Admin
>>>>> 
>>>>> One of the downsides of db-per-user is managing design docs in the face 
>>>>> of a changing application, that is, how to distribute new design docs 
>>>>> across 10s of 1000s of user dbs? It’s not impossible, but tedious. In 
>>>>> all scenarios above but scenario 3., we could simplify this 
>>>>> significantly. Say an admin creates a design doc, and gives all users in 
>>>>> the db access to this design doc (this could be with the _users role, or 
>>>>> yet another new role _members, if we need it), requesting the result of a 
>>>>> view defined in that design doc will produce an index that is powered by 
>>>>> the requesting user’s by-access-seq index section(s).
>>>>> 
>>>>> N.B., this would require us to change a fundamental assumption when doing 
>>>>> the association between a design doc’s definition and index: normally, 
>>>>> there is only the `views` member that is hashed and that hash is used as 
>>>>> the index’s filename. Because there is only by-seq to power a view, that 
>>>>> all works. But now that we have an arbitrary set of sections on 
>>>>> by-access-seq, any view index built will have to take a user’s name and 
>>>>> roles into account. When a user leaves a group, or gains a group, all 
>>>>> indexes for that user will no longer be valid and need rebuilding.
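The change in index identity described above can be sketched as follows. This is an illustration only: `index_signature` is a hypothetical name, and hashing canonical JSON with SHA-256 stands in for whatever hashing the real index code would use.

```python
import hashlib
import json


def index_signature(views, user_ctx=None):
    """Derive an index name from the views definition.

    Without user_ctx this models the by-seq case, where only `views` is
    hashed. With user_ctx, the user's name and roles become part of the
    signature, so a change in group membership yields a different index.
    """
    payload = {"views": views}
    if user_ctx is not None:
        payload["name"] = user_ctx["name"]
        # Sort roles so that their order does not change the signature.
        payload["roles"] = sorted(user_ctx.get("roles", []))
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()
```

Under this scheme a user who leaves a group gets a new signature, and the index stored under the old one is orphaned and needs cleaning up.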
>>>>> 
>>>>> 
>>>>> ### User
>>>>> 
>>>>> In any of the scenarios above, but especially 3., there could be 
>>>>> legitimate per-user design docs, so how should those be treated in an 
>>>>> _access enabled database?
>>>>> 
>>>>> The significant fields in a design doc are `views`, `validate_doc_update` 
>>>>> and `filters` (I’ll skip over the deprecated _show, _list, and _update).
>>>>> 
>>>>> The easiest to handle is a `filters`: if a user specifies a filter for a 
>>>>> _changes request or replication that lives in a design doc they don’t 
>>>>> have access to, they get an error, similar to if they specify a 
>>>>> non-existent design doc, just with `unauthorized` instead of `not_found`.
>>>>> 
>>>>> Next `views` is also not very hard to imagine working: just like globally 
>>>>> defined views for that db, the index is built for each user based on the 
>>>>> user’s name and roles.
>>>>> 
>>>>> More troubling are `validate_doc_update` functions: One, they are already 
>>>>> troubling in that they slow down any document updates. Two, if we now 
>>>>> import an existing db-per-user scenario where each user has their own 
>>>>> design docs,
>>>> 
>>>> I can’t think of a db-per-user scenario where each user DB would have a 
>>>> different validate_doc_update method? It would be the same method with 
>>>> access to the user context, the DBs security setting and the document, so 
>>>> it would act differently for different users, but using the same code.
>>> 
>>> They wouldn’t be different, but if we were to replicate 1000 db-per-user 
>>> design docs into a single database, as per today’s semantics, we’d have to 
>>> run 1000 VDUs on each doc update.
>>> 
>>>> 
>>>>> how should we apply validate_doc_update functions? 10s of 1000s of VDUs 
>>>>> are impractical to apply on each doc update, let alone just the 
>>>>> management of VDUs that are active on a database. One option would be to 
>>>>> ignore VDUs if they are not defined globally (say with a _members role). 
>>>>> But especially in scenario 3. this becomes problematic, but even without 
>>>>> that specific scenario, this violates the no surprises best practice.
>>>>> 
>>>>> We could say:
>>>>> 
>>>>> a) we don’t support scenario 3.
>>>> 
>>>> +1, I think it would make our lives easier in general if we don’t 
>>>> recommend to share the same CouchDB for multiple apps. At least I don’t 
>>>> see a reason to do that at this point.
>>> 
>>> I think I like this best, too, but I’d like to hear from others as well.
>>> 
>>> 
>>> Best
>>> Jan
>>> —
>>>> 
>>>>> b) we find a complicated but efficient way to apply only those VDUs that 
>>>>> are defined in design docs the writing user has access to plus any global 
>>>>> ones (this would be neat but rather complicated and potentially still 
>>>>> impractical from a performance perspective for N users).
>>>>> c) we could store all per-user design docs, but ignore them completely, 
>>>>> VDUs, views and filters.
>>>>> 
>>>>> I think I currently fall on the side of not supporting scenario 3. and 
>>>>> asking folks who migrate db-per-user to de-duplicate design docs and keep 
>>>>> them per-app. I believe that is a good trade-off between the most common 
>>>>> scenarios for db-per-user while keeping the implementation manageable. 
>>>>> Globally accessible design docs would show up in a user’s changes feed 
>>>>> and would replicate down to say a PouchDB application which might be the 
>>>>> exclusive user of those design docs.
>>>>> 
>>>>> In practice this would mean, a document that has an _id that starts with 
>>>>> _design/ will have to be produced by a database admin. Luckily, that’s 
>>>>> already the case. We should just make sure that folks don’t give db-admin 
>>>>> access to all users habitually.
>>>>> 
>>>>> 
>>>>> ## Read and Write Access
>>>>> 
>>>>> Speaking of validate_doc_update, it is used for two things: checking 
>>>>> document schema and doc update authorisation.
>>>>> 
>>>>> Once we allow access to a document with an _access field, we need to 
>>>>> decide what kind of access this gives to a doc: read-only or read-write 
>>>>> (I’m not considering write-only because for anything but doc creations 
>>>>> this is not useful as you need access to the current _rev).
>>>>> 
>>>>> However, when we look at implementing an application on top of our 
>>>>> existing API, it is already weird that read access can be controlled 
>>>>> globally (or with _access on a per doc level), but write access requires 
>>>>> writing JavaScript code. I think it would be a reasonable expectation for 
>>>>> users to expect a per-doc read/write permission granting.
>>>> 
>>>> Yes!
>>>> 
>>>>> 
>>>>> So we could have all of the above, but with two extra fields: 
>>>>> _access_read and _access_write, or _access: {read: [], write: []}
>>>> 
>>>> I prefer this API for its compactness, thinking about offline 
>>>> synchronization. The smaller the docs, the better.
>>>> 
>>>> Best
>>>> “Gregor”
>>>> —
>>>> 
>>>> 
>>>>> or we overload user and group names: _access: [user_a:read, user_b:write] 
>>>>> (or any permutation thereof). Overloading can cause trouble with 
>>>>> naturally occurring characters in group names.
>>>>> 
>>>>> The former seems more explicit, but from an API perspective that’s a 
>>>>> little more awkward: remember that we currently have an arbitrary limit 
>>>>> of 10 members in a user’s role array, to avoid excessive fan out on 
>>>>> cluster-internal operations. Partitioned dbs could get away with more, 
>>>>> more easily however. If we allow the specification of access control in 
>>>>> two lists, and one of the lists implies membership in the other, we have 
>>>>> a total limit of 10 members across both arrays. Or we limit 5 + 5, but 
>>>>> that seems excessive, while 10 total seems weird, but doable. Anyway, 
>>>>> good bikeshed.
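One way to picture the two-list variant with a shared limit is sketched below. It assumes that write access implies read access (the mail only says one list implies the other) and that the limit of 10 applies to distinct names across both lists; `normalize_access`, `may_read` and `may_write` are hypothetical names.

```python
MAX_ACCESS_ENTRIES = 10  # total across read and write, as discussed above


def normalize_access(access):
    """Deduplicate {read: [], write: []} and enforce the shared limit."""
    write = list(dict.fromkeys(access.get("write", [])))
    # Writers can read anyway, so they need no separate read entry.
    read = [n for n in dict.fromkeys(access.get("read", [])) if n not in write]
    if len(read) + len(write) > MAX_ACCESS_ENTRIES:
        raise ValueError("too many _access entries")
    return {"read": read, "write": write}


def _identities(user_ctx):
    return set(user_ctx.get("roles", [])) | {user_ctx["name"]}


def may_write(access, user_ctx):
    return bool(_identities(user_ctx) & set(access["write"]))


def may_read(access, user_ctx):
    # Assumption: write access implies read access.
    return may_write(access, user_ctx) or bool(
        _identities(user_ctx) & set(access["read"])
    )
```

Storing writers only once keeps the doc small and makes the count of distinct entries the thing that the limit applies to.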
>>>>> 
>>>>> 
>>>>> * * * 
>>>>> 
>>>>> 
>>>>> So far, I think all of the problems outlined are solvable, if only with a 
>>>>> clear definition of which use-cases we do not support with _access. If you 
>>>>> have more scenarios than the ones I outlined, please add them and we can 
>>>>> see if they cause any additional trouble.
>>>>> 
>>>>> Thanks for reading this far and I’m looking forward to your feedback.
>>>>> 
>>>>> 
>>>>> Best,
>>>>> Jan “_access” Lehnardt
>>>>> —
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 17. Feb 2019, at 15:25, Jan Lehnardt <j...@apache.org> wrote:
>>>>>> 
>>>>>> Hi Everyone,
>>>>>> 
>>>>>> I’m happy to share my work in progress attempt to implement the per-doc 
>>>>>> access control feature we discussed a good while ago:
>>>>>> 
>>>>>> https://lists.apache.org/thread.html/6aa77dd8e5974a3a540758c6902ccb509ab5a2e4802ecf4fd724a5e4@%3Cdev.couchdb.apache.org%3E
>>>>>> 
>>>>>> You can check out my branch here:
>>>>>> 
>>>>>> https://github.com/apache/couchdb/compare/access?expand=1
>>>>>> 
>>>>>> It is very much work in progress, but it is far enough along to warrant 
>>>>>> discussion.
>>>>>> 
>>>>>> The main point of this branch is to show all the places that we would 
>>>>>> need to change to support the proposal.
>>>>>> 
>>>>>> Things I’ve left for later:
>>>>>> 
>>>>>> - currently only the first element in the _access array is used. Our 
>>>>>> and/or syntax can be added later.
>>>>>> - building per-access views has not been implemented yet, couch_index 
>>>>>> would have to be taught about the new per-access-id index.
>>>>>> - pretty HTTP error handling
>>>>>> - tests except for a tiny shell script 😇
>>>>>> 
>>>>>> Implementation notes:
>>>>>> 
>>>>>> You create a database with the _access feature turned on like so:  PUT 
>>>>>> /db?access=true
>>>>>> 
>>>>>> I started out with storing _access in the document body, as that would 
>>>>>> allow for a minimal change set, however, on doc updates, we try hard not 
>>>>>> to load the old doc body from the database, and forcing us to do so for 
>>>>>> EVERY doc update under _access seemed prohibitive, so I extended the 
>>>>>> #doc, #doc_info and #full_doc_info records with a new `access` attribute 
>>>>>> that is stored in both by-id and by-seq. I will need guidance on how 
>>>>>> extending these records impacts multi-version cluster interop. And 
>>>>>> especially whether this is an acceptable approach.
>>>>>> 
>>>>>> https://github.com/apache/couchdb/compare/access?expand=1&ws=0#diff-904ab7473ff8ddd07ea44aca414e3a36
>>>>>> 
>>>>>> * * *
>>>>>> 
>>>>>> The main addition is a new native query server called 
>>>>>> couch_access_native_proc, which implements two new indexes by-access-id 
>>>>>> and by-access-seq which do what you’d expect, pass in a userCtx and 
>>>>>> retrieve the equivalent of _all_docs or _changes, but only including 
>>>>>> those docs that match the username and roles in their _access property. 
>>>>>> The existing handlers for _all_docs and _changes have been augmented to 
>>>>>> use the new indexes instead of the default ones, unless the user is an 
>>>>>> admin.
>>>>>> 
>>>>>> https://github.com/apache/couchdb/compare/access?expand=1&ws=0#diff-fbb53323f07579be5e46ba63cb6701c4
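As a rough mental model of what the by-access-id index does, here is an in-memory sketch. It is not the prototype's code: `build_by_access_id` and `all_docs_for` are hypothetical names, and treating docs without `_access` as admin-only follows the "admin-only for now" note in this mail.

```python
from collections import defaultdict


def build_by_access_id(docs):
    """Map each _access entry (user name or role) to the ids it may see."""
    index = defaultdict(set)
    for doc in docs:
        # Docs without _access are treated as admin-only.
        for entry in doc.get("_access", ["_admin"]):
            index[entry].add(doc["_id"])
    return index


def all_docs_for(index, all_ids, user_ctx):
    """Equivalent of _all_docs, restricted to what the user may access."""
    roles = user_ctx.get("roles", [])
    if "_admin" in roles:
        # Admins keep using the default, unrestricted index.
        return sorted(all_ids)
    visible = set()
    for key in [user_ctx["name"], *roles]:
        visible |= index.get(key, set())
    return sorted(visible)
```

A by-access-seq equivalent would key on update sequence instead of doc id, which is what a restricted _changes feed would read from.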
>>>>>> 
>>>>>> * * *
>>>>>> 
>>>>>> The rest of the diff is concerned with making document CRUD behave as 
>>>>>> you’d expect it. See this little demonstration for what things look like:
>>>>>> 
>>>>>> https://gist.github.com/janl/b6d3f7502aa20b7b9ab9d9dcb8e92497 (I’m 
>>>>>> just noticing that there might be something wonky with DELETE, but 
>>>>>> you’ll get the gist #rimshot)
>>>>>> 
>>>>>> * * *
>>>>>> 
>>>>>> Open questions:
>>>>>> 
>>>>>> - The aim of this is to get as close to regular CouchDB behaviour as 
>>>>>> possible. One thing that is new, however, and which would require all 
>>>>>> apps to be changed, is that in an _access enabled database they need to 
>>>>>> include an _access field in their docs (docs with no _access are 
>>>>>> admin-only for now). We might want to consider auto-inserting the 
>>>>>> authenticated user’s name as the first element in the _access array on 
>>>>>> new document writes, so existing apps “just work”.
>>>>>> 
>>>>>> - Interplay with partitioned dbs: eschewing db-per-user is already a 
>>>>>> large boon if you have a lot of users, but making those per-user 
>>>>>> requests inside an _access enabled database efficient would be doubly 
>>>>>> nice, so why not use the username from the first question above and use 
>>>>>> that as the partition key? This would work nicely for natural users with 
>>>>>> their own docs that want to share them with others later, but I can 
>>>>>> easily imagine a pipelined use of CouchDB, where a “collector” user 
>>>>>> creates all new docs, an “analyser” takes them over and hands them to a 
>>>>>> “result” user for viewing. In that case, we’d violate the 
>>>>>> high-cardinality rule of partitions (have a lot of small ones); instead, 
>>>>>> all docs go through all three users. I’d be okay with treating the latter 
>>>>>> scenario as a minor use-case, but for that use-case, we should be able 
>>>>>> to disable auto-partitioning on db creation.
>>>>>> 
>>>>>> - Building access view indexes for docs that have frequent _access 
>>>>>> changes leads to many orphaned view indexes; we should look at an 
>>>>>> auto-cleanup solution here (maybe keep 1-N indexes in case folks just 
>>>>>> swap back and forth).
>>>>>> 
>>>>>> * * *
>>>>>> 
>>>>>> I’ll leave this here for now, I’m sure there are a few more things to 
>>>>>> consider.
>>>>>> 
>>>>>> I’d love to hear any and all feedback you might have. Especially if 
>>>>>> anything is unclear.
>>>>>> 
>>>>>> Best
>>>>>> Jan
>>>>>> —
>>>>> 
>>>>> -- 
>>>>> Professional Support for Apache CouchDB:
>>>>> https://neighbourhood.ie/couchdb-support/
>>>>> 