[ 
https://issues.apache.org/jira/browse/COUCHDB-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Newson closed COUCHDB-2236.
----------------------------------

    Resolution: Cannot Reproduce

> Weird _users doc conflict when replicating from 1.5.0 -> 1.6.0
> --------------------------------------------------------------
>
>                 Key: COUCHDB-2236
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-2236
>             Project: CouchDB
>          Issue Type: Bug
>      Security Level: public(Regular issues) 
>            Reporter: Isaac Z. Schlueter
>
> The upstream write-master for npm is a CouchDB 1.5.0.  (Since it is locked 
> down at the IP level, we're not at risk from the DoS fixed in 1.5.1.)
> All PUT/POST/DELETE requests are routed to this master box, as well as any 
> request with `?write=true` on the URL.  (Used for cases where we still do the 
> PUT/409/GET/PUT dance, rather than using a custom _update function.)
> This master box replicates to a replication hub.  The read slaves all 
> replicate from the replication hub.  Both the /registry and /_users databases 
> replicate continuously using a doc in the /_replicator database.
> As I understand it, since replication only goes in one direction, and all 
> writes go to the upstream master, conflicts should be impossible.
> We brought a 1.6.0 read slave online, version 1.6.0+build.fauxton-91-g5a2864b.
> On this 1.6.0 read slave (and only there), we're seeing /_users doc 
> conflicts, and it looks like it has a different password_sha and salt.  Here 
> is one such example: https://gist.github.com/isaacs/63f332a15109bbfdb8ac  
> (actual password_sha and salt mostly redacted, but enough bytes left in so 
> that you can see they're not matching.)
> A few weeks ago, this issue popped up, affecting about 400 user docs, and we 
> figured that it had to do with some instability or human error at the time 
> when that box was set up.  We deleted all of the conflicts, and verified that 
> all docs matched the upstream at that time.  We removed the /_replicator 
> entries, and re-created them using the same script we use to create them on 
> all the other read slaves.
> If this was just one or two docs, or happening across more of the read 
> slaves, I'd be more inclined to think that it has something to do with a 
> particular user, or our particular setup.  However, the /_replicator docs are 
> identical in the 1.6.0 box as on the other read slaves.  This is affecting 
> about 150 users, and only on that one box.
> We've taken the 1.6.0 read slave out of rotation for now, so it's not an 
> urgent issue for us.  If anyone wants to log in and have a look around, I can 
> grant access, but I hope that there's enough information here to track it 
> down.  Thanks.
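
The check described in the report (spotting /_users docs that carry conflicts and whose password_sha and salt differ from the upstream copy) could be sketched roughly as follows. This is only an illustration, not the reporter's tooling: the helper names, the base URL argument, and the example values are hypothetical, and authentication is left out. The one CouchDB behaviour relied on is that fetching a document with `?conflicts=true` adds a `_conflicts` array when conflicting revisions exist.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def conflict_revs(doc):
    """Return the conflicting revision ids reported in a fetched document.

    CouchDB only includes `_conflicts` when the document was requested with
    ?conflicts=true and at least one conflicting revision exists.
    """
    return list(doc.get("_conflicts", []))


def differing_fields(upstream_doc, replica_doc, fields=("password_sha", "salt")):
    """List the named fields whose values differ between two copies of a doc."""
    return [f for f in fields if upstream_doc.get(f) != replica_doc.get(f)]


def fetch_with_conflicts(base_url, db, doc_id):
    """Fetch one document with ?conflicts=true (hypothetical helper, no auth)."""
    url = "{}/{}/{}?conflicts=true".format(
        base_url.rstrip("/"), db, quote(doc_id, safe="")
    )
    with urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a copy of the same user doc fetched from the master and from the 1.6.0 box, `conflict_revs(replica_doc)` shows whether the replica holds conflicting revisions, and `differing_fields(master_doc, replica_doc)` names the credential fields that no longer match.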



--
This message was sent by Atlassian JIRA
(v6.2#6252)
