On Wed, Jan 22, 2014 at 3:55 AM, Adam Kocoloski <kocol...@apache.org> wrote:
> Agree, great discussion here. I will say that I think the INI > configuration system is quite nice, and that I'd love to see couch_config / > config gain wider adoption as an alternative to using OTP application > environments (of course we should still make the INI file itself optional > by embedding the defaults in the code). I'd also say that the fabric.erl > interface has a number of warts that I'd love to clean up sometime. > What about having something like cuttlefish [1] from Basho. But instead to compile from a nginx like format to erlang config file we compile from ini files? In that case couchdb will have only to fetch the config using Erlang config system. - benoit [1] https://github.com/basho/cuttlefish > > In terms of priorities, I agree with Paul's summary: > > > I think we should be focusing on minimizing the amount of effort needed > to make the merge work. I think all of your proposals here are solid and > something we should pursue but I think they would make the merge more > complicated rather than simpler. > > The sooner we can get some of these repos into master the sooner we can > establish well-scoped topic branches for Benoit's great recommendations. > > Adam > > On Jan 21, 2014, at 3:07 PM, Russell Branca <chewbra...@apache.org> wrote: > > > Lot of great ideas in this thread. A number of the suggested items are > > feature and refactoring work, so I agree with Paul that we should > separate > > those from the merges themselves, and concentrate on them after we've got > > the base pieces merged. > > > > Fabric works well for an internal API on the clustered bits. It could > use a > > little work to make it more isolated from chttpd so that it's more > directly > > usable. I think it makes sense to discuss making a more uniform API that > > provides similar functionality for clustered vs non clustered operations, > > but this is outside the scope of the merge. > > > > As Paul mentioned, I've got a number of ideas for ripping out auth into > an > > isolated application that handles #user_ctx, roles, and user CRUD, which > I > > want to dive into more after the merges. > > > > > > -Russell > > > > > > On Tue, Jan 21, 2014 at 5:41 AM, Robert Samuel Newson < > rnew...@apache.org>wrote: > > > >> > >> To follow on from Paul’s responses, we will be deleting fabric_rpc2 > before > >> the merge. That exists only temporarily at Cloudant to ease upgrades, as > >> Paul notes, it has no place in the final source tree. > >> > >> B. > >> > >> On 21 Jan 2014, at 11:39, Paul Davis <paul.joseph.da...@gmail.com> > wrote: > >> > >>> Good points all around. Replying in line for context. > >>> > >>> On Tue, Jan 21, 2014 at 2:33 AM, Benoit Chesneau <bchesn...@gmail.com> > >> wrote: > >>>> Hi all, > >>>> > >>>> I was reviewing the bigcouch and the rcouch branch this morning and I > >>>> think it is about time to start a real merge. Waiting that each merges > >>>> reach a working version before starting any work on the final product > >>>> is quite unproductive to say less. > >>>> > >>>> On the contrary what I see in the current branches let me think, it > is a > >>>> good time to start some work to improve our code base in view of > >>>> achieving the 3 main goals we fixed a long time ago during the summit > >>>> and that were confirmed in December last year, ie.: > >>>> > >>>> - add a cluster facility to Apache CouchdDB > >>>> - allows Apache CouchDB to be embedded in other Erlang applications > >>>> (just like mnesia the standard database library in Erlang) > >>>> - make Apache CouchDB more *OTP*ish > >>>> > >>>> The changes in bigcouch are indeed quite more monolithic than the > rcouch > >>>> changes by adapting Apache Couchdb to be able to run in a cluster > >>>> environment. While the cluster management part is quite isolated > (mem3, > >>>> rexi, chttpd), others parts of bigcouch are wrapping or modifying the > >>>> apache couchdb internals: > >>>> > >>>> - fabric is adding fabric_rpc and fabric_rpc2 wrapping some modules > and > >>>> functions in the couchdb core to be able to call them on the cluster. > >>>> It should be noted that the cluster part is using fabric to do all > calls > >>>> to each couchdb nodes on the cluster and return/merge the results. > >>> > >>> For clarification the duplication of these modules is a side effect of > >>> running hot code upgrades. Conceptually fabric_rpc2 is the same as > >>> fabric_rpc, we just use the second module to do RPC API upgrades on > >>> live clusters. Not something to worry about extensively. > >>> > >>>> - some changes has been done to optimise the core: removing ref > >>>> counting, adding ets_lru to handle caching, db initialization has been > >>>> changed to look at the cluster to fetch ddocs to set validate funcs. > >>> > >>> The caching doesn't actually affect core. Its only used for clustered > >>> calls at the moment. The big thing here is definitely > >>> validate_doc_update functions being loaded lazily and the fact that > >>> single node code has to run clustered calls. I definitely agree that > >>> better internal APIs could probably clean this up. > >>> > >>>> - couch_replicator has been modified to work in a cluster > >>>> - bigcouch also changed the way the configuration is handled in > couchdb, > >>>> making it more evented. Which is better than the current version but > >>>> still rely on an INI file. > >>> > >>> We did change the API to enforce some basic patterns to avoid common > >>> pitfalls in hot code upgrades but the fundamental operation is still > >>> the same. Unfortunately, as you point out, this doesn't change our > >>> reliance on INI files. > >>> > >>>> - logs are now handled by twig a specific application created by > >>>> Cloudant for that purpose. > >>>> > >>> > >>> We're looking to switch to lager as well. Twig was written pre-Lager > >>> and we haven't made the switch because it hasn't been a priority yet. > >>> We've got Lager queued up as a task to complete before the final > >>> merge. > >>> > >>>> > >>>> rcouch changes consist mostly in the following: > >>>> > >>>> - making Apache CouchDB an OTP release. It also extracts some code and > >>>> revisits the supervision like couch_httpd to transform it as a > >>>> standalone application, and make couch_replicator a full app > >>> > >>> For the record I removed all of the BigCouch stuff related to Erlang > >>> release handling because we started the merge when autotools was still > >>> a requirement. We do Erlang releases exclusively at Cloudant and > >>> switching to this is now going to be part of the merge. We do have > >>> some differences in some of the build tools but the rcouch build > >>> system isn't significantly different than our internal builds. We'll > >>> switch to something much more rcouch like before the merge. > >>> > >>>> - add some features as standalone Erlang apps > >>>> - adding view changes: which edit couch_mrview, couch_index, > >>>> couch_replicator by adding new features without changing the current > >>>> one or internals > >>>> - changes the core by adding some optimizations and new features: > remove > >>>> ref counting, add new caching, add validate_doc_read. Actually > >>>> these optimizations have not been merge in view of using those in > >> bigcouch. > >>>> The collation has been replaced by a nif available as a standalone > >>>> Erlang application couch_collate. > >>> > >>> Some of these I've skimmed and others I haven't spent any time looking > >>> at. The couch_collate change isn't a huge issue. I may suggest some > >>> optimizations in the future but the general idea is solid. The caching > >>> and validate_doc_read changes I'll have to review when I see them in > >>> commits. I didn't notice anything related on 1994-merge-rcouch when I > >>> reviewed it last week but I was mostly skimming for major things. > >>> > >>>> - A lot of changes have been done in the build process > >>> > >>> Quite a lot. We're both rebar centric so I'm not too concerned. I'd > >>> like to see a bit more thought on some of the ICU and SpiderMonkey > >>> aspects but nothing here is overly controversial I don't think. Maybe > >>> some UI things could use some polish. > >>> > >>>> - logs are handled by lager an application created by Basho commonly > >>>> used these days in the Erlang world. > >>>> > >>> > >>> +1 > >>> > >>>> So rcouch and bigcouch are conflicting on the following right now: > >>>> > >>>> - couch_replicator > >>> > >>> I'm not sure I've seen what the actual changes in rcouch are. I didn't > >>> look too hard, but for the most part Cloudant changes are bug fixes > >>> and making the _replicator db work in a cluster. I wouldn't think > >>> there'd be major conflicts here. > >>> > >>>> - couch_index, couch_mrview > >>> > >>> This isn't a conflict because we don't use these internally yet. Work > >>> to start using this is queued up to happen in the next few weeks. > >>> > >>>> - internals: some parts of the code. the collation, jiffy changes are > >>>> not the same, ... . > >>> > >>> Collation and Jiffy are trivial changes. Jiffy I actually pulled out > >>> of BigCouch cause ejson was still a thing and I was trying to minimize > >>> the change. couch_collate looks mostly fine to me so I'm not overly > >>> concerned about that. The ref counting changes fou > >>> > >>>> - build process > >>>> - logging system > >>>> > >>>> > >>>> The goal of making couchdb embedabble in another Erlang application > >>>> is not achieved in bigcouch and still difficult with rcouch. > >>>> > >>>> - Difficult with rcouch because the configuration is actually strongly > >>>> based on settings passed via an ini file. > >>>> - Difficult with bigcouch because of the internal changes implying > the > >>>> usage by default of a cluster (like loading ddocs at initialization). > >>>> Also it has the same problem with the ini files. > >>>> - Both handle authorization at the core level instead of handling > >>>> it in a different layer mixing code between the internal api and the > >>>> HTTP API. > >>>> > >>>> If we are able to embed the couchdb core in any Erlang application, it > >>>> will considerably help us to merge bigcouch and rcouch quietly, > easily, > >>>> with people working in parallel to achieve it. Building a plugin will > be > >>>> a lot easier as well. The good thing is that both projects have the > >>>> roots to do it, it is a matter to merge some code and refactor some > >>>> internals: > >>>> > >>> > >>> I mostly agree with these comments. The reliance on INI files > >>> definitely prevents the ease of embedding the core storage engine. > >>> > >>> One thing, you misread the code for the default clustering a bit. The > >>> cluster assumption is only for database names that follow the pattern > >>> "shards/00000000-ffffffff/dbname.couch". Anything not following that > >>> pattern works just fine locally. Remember that we have the entire > >>> CouchDB single node test suite passing (minus a specific test or two > >>> for known reasons). We also use the single node APIs quite a bit for > >>> administration so its in our interest to make sure that continues to > >>> work. > >>> > >>> Definitely agree on the authorization issues. Russel Branca has > >>> recently been suggesting that we undo the core ties to user contexts > >>> for other reasons. I don't think this is an "if" so much as a "when" > >>> question. > >>> > >>>> - Have a clean internal API. Fabric the cluster library of bigcouch is > >>>> adding 2 modules wrapping the couchdb API (couch_db, couch_stream, > >>>> ...) `fabric_rpc` and `fabric_rpc2`. couch_httpd* modules are also > >>>> wrapping it to send the result to the HTTP clients. Both are adding > >>>> a layer over the internals just to be able to use the core which is > >> really > >>>> inefficient. It forces Erlang application that want to call directly > >>>> couchdb (like couchc [1]) to also wrap these internals to make them > >>>> useful. We should on the contrary offer a clean API at the core level > >>>> usable by fabric or any Erlang App. The API offered by the > `fabric_rpc*` > >>>> or `couchc` should be our internals. > >>>> > >>> > >>> I'm not immediately familiar with couchc but fabric.erl mostly > >>> provides a clean API for everything needed for normal client access. > >>> There have been general discussions on writing a single node version > >>> of that module for exactly the same reasons you state. For instance, > >>> it'd be nice to have either chttpd or couch_httpd. Having both is just > >>> a waste. If we had two modules that would switch back and forth > >>> between single node vs. clustered would make that significantly > >>> easier. > >>> > >>>> So I propose as a first merge action to refactor the internal API so > >>>> it can be used directly by fabric and future Erlang applications > >>>> embedding or calling directly couchdb. > >>>> > >>> > >>> I disagree with this priority. We've already got enough going on and > >>> there's really not that much conflict (so far as I can tell) with > >>> internal APIs. Adding a "fix the thing we haven't fixed in years" step > >>> sounds like a recipe for disaster. This is definitely something we > >>> should do going forward I just don't think its a precondition for the > >>> merge work. > >>> > >>>> - Make the auth* apart - remove all `#user_ctx{}` calls from > >>>> the couch_* modules- to handle the it in the transport or application > >>>> level. HTTP for now.By doing that we make the `couch` application > >>>> completely independent of the transport, so any erlang app can embed > it > >>>> and propose its own API to enter the data. (just like mnesia). It will > >>>> also remove all the extra code we have to force the authorization by > >>>> faking the user context when it needs to use the internal api. The > >>>> auth* should be provided as an extension imo. > >>>> > >>> > >>> I agree and Russel has ideas for things quite like this. Again not > >>> something I think should be a precondition for the merge. (Not sure if > >>> you're proposing the precondition part here though). > >>> > >>>> - removing validate_* initialization in the core db level and let it > >>>> to the transport or the application. For example it could be done when > >>>> opening the DB via the HTTP api by wrapping the open_db call. Some > >>>> applications may not need it at all. The core should still provide > >>>> hooks to do these actions, but the way the hooks are passed to the > >>>> database should be an application decision. couch_httpd is an > >>>> application exposing the couchdB API to other via HTTP. > >>>> > >>> > >>> I'll have to look closer at validate_doc_read to get an idea on this. > >>> At first I also think this is also in the "good idea but not a merge > >>> requirement" but that may change when reading the validate_doc_read > >>> implementation. We specifically rely on not requiring the functions > >>> for reads so I'll have to think harder on this. > >>> > >>>> - adding special databases hooks only on the application level. Right > >>>> now we have specific databases handled in the core code (`_users` and > >>>> and `_replicator`). We should instead pass this hook on another level, > >>>> like when we initialize the couch_replicator application for example. > >>>> > >>> > >>> I'm not sure what you mean here. I 100% agree that we should aim to > >>> remove the special logic for these databases in core. I'd prefer to > >>> see their "database likenesses" removed entirely but that's a > >>> different story for a different thread. > >>> > >>>> - making the configuration created by an INI file optional. We could > >> have > >>>> a default configuration set via the API that can be feed by later by > >>>> the ini file or any other config system. > >>>> > >>> > >>> I definitely agree that there's config improvements to be made. One of > >>> the things I'd like to see is defaults in code along with the > >>> possibility for enforcing type constraints (ie, reject updates that > >>> change an integer to a string or vice versa). > >>> > >>>> - merging any optimizations at the same time (ref counting > replacement). > >>>> > >>> > >>> These sorts of things I think we'll just have to look at on a > >>> case-by-case basis. The fact that both rcouch and BigCouch needed to > >>> rewrite these bits shows that its an issue. Its just a matter of > >>> making sure that the merge supports all the required use cases. > >>> > >>>> > >>>> Imo if we do the work above it will allows us to speed the merge of > >>>> both rcouch and bigcouch. The second step would be having the fabric > api > >>>> to use this clean api, and modifying couch_htpd and couch_replicator > to > >>>> use it. Bonus point, it would ease unitesting by isolating the cluster > >>>> part from the core. > >>>> > >>> > >>> While there are definitely some warts on the clustering APIs I would > >>> argue that 95% of the clustering code is isolated from core. There's > >>> definitely a lot more logic in fabric for clustering than the few > >>> places we need to call fabric from core. Though I totally agree that > >>> having 100% of core not care about fabric would be ideal. > >>> > >>> On the other hand, I disagree that any of these as a precondition for > >>> merging would speed the merge. They're all things we should be working > >>> towards and definitely things we should be thinking about while > >>> working on the merge, but nothing in here strikes me as a road block > >>> to merging. They help us clean up lots of things post-merge like > >>> having two extremely close HTTP layers but while that duplication > >>> stinks to high heavens, its worked well enough for 6+ years of > >>> production deployments. > >>> > >>>> I probably forgot some features that could also be done during the > >>>> merge, like merging couch_index and couch_mrview in one app or at > least > >>>> having `_all_docs` handled by couch_httpd but these changes can be > done > >>>> at another level imo. Also the `_all_dbs` is handled differently, in > >>>> bigcouh it is handled by recording all the dbs in a database, in > couchdb > >>>> we are right now only relying only on the fs. The bigcouch solution > >>>> should be the default imo. > >>>> > >>> > >>> These are also good things to consider. I'm definitely +1 on the > >>> couch_index/couch_mrview refactoring as well as moving _all_docs > >>> around. That said its definitely not a priority for the merge. We have > >>> two major forks to merge back into mainline. I'd prefer we focus on > >>> the minimal amount of effort required to make that happen so that we > >>> can then focus on some of these other bigger things. > >>> > >>> RIght now I think the focus should be exclusively on getting the three > >>> code bases merged back to master in a fashion that allows us to start > >>> iterating on all of the things you've listed. That means taking the IP > >>> cleared code we have and doing the legwork to get it all onto master > >>> in such a fashion that we can continue making progress. Once that's > >>> done I think we can start moving towards making changes to simplify > >>> internals and speed development. > >>> > >>> Or to phrase that slightly differently, I think we should be focusing > >>> on minimizing the amount of effort needed to make the merge work. I > >>> think all of your proposals here are solid and something we should > >>> pursue but I think they would make the merge more complicated rather > >>> than simpler. > >>> > >>>> > >>>> Anyway, I wish we can start this work ASAP rather than just working > on 2 > >>>> branches in parallel. > >>>> > >>> > >>> I definitely agree. I think the multi-repo approach will help us focus > >>> on what's actually in conflict and then we can move towards resolving > >>> those changes. I know the code bases look quite different at the > >>> moment but from all that I've read so far, these are mostly > >>> superficial issues. > >>> > >>> For instance, the rcouch build system is significantly different than > >>> the bigcouch version but that's due to me wiring the bigcouch build > >>> into autotools. Here we'll mostly default towards the rcouch system. > >>> We'll have to iron out some details but its nothing insurmountable. > >>> The index work in rcouch will mostly be pulled directly into bigcouch. > >>> At the merge we may not have 100% support for the newer features but > >>> that just means we may cut another single node release before we cut > >>> one that has API parity between clustered and single node. > >>> > >>>> Any feedback is welcome, > >>>> > >>>> - benoit > >>>> > >>>> [1] http://github.com/benoitc/couchc > >> > >> > >