On 02/02/2011 10:37 AM, Andrew Kerr wrote:
> I reinstalled the two replicas that were saying "No such object" and now they 
> work - same exact cut-and-paste process that didn't work before.
>
> The good news is that I am back up and running (phew, what a morning!).
>
> I left one replica on 1.2.7.5, disabled behind our load balancer, so it 
> still receives replication but no production traffic.  The intent is to help 
> figure out what the problem is before others hit it.  I'll file a bug 
> report, since this seems like something new.
>
> FYI, these are all virtual machines (on a mix of VMware, KVM, and Xen, 
> depending on the datacenter) with very minimal installs, running no other 
> apps, and with no SELinux or anything similar.
Is the 1.2.7.5 server still crashing?  If so, please post the last few 
lines of the errors log before the crash.

See also here: 
http://directory.fedoraproject.org/wiki/FAQ#Debugging_Crashes
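
For anyone comparing the "closed - B1" and "closed - T2" access log lines mentioned in the quoted messages below, here is a minimal Python sketch for tallying connection close codes. It is not part of 389 Directory Server. The function name and the regular expression are assumptions modeled only on the sample line quoted in this thread, so adjust them if your log format differs.

```python
import re
from collections import Counter

# Pattern modeled on the access-log line quoted in this thread, e.g.
# "[02/Feb/2011:10:05:06 -0500] conn=4479 op=-1 fd=161 closed - B1"
# (an assumption: other log lines may use a different layout).
CLOSE_RE = re.compile(r"conn=(\d+) op=(-?\d+) fd=(\d+) closed - (\w+)")


def count_close_codes(lines):
    """Return a Counter mapping each close code (B1, T2, ...) to its count."""
    counts = Counter()
    for line in lines:
        match = CLOSE_RE.search(line)
        if match:
            # Group 4 is the code that follows "closed - ".
            counts[match.group(4)] += 1
    return counts
```

To use it, open your instance's access log and pass the file object to count_close_codes(), then print the result of .most_common(). A sudden jump in one code shortly before the process disappears would be worth including in the bug report.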
> -----Original Message-----
> From: 389-users-boun...@lists.fedoraproject.org 
> [mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Andrew Kerr
> Sent: Wednesday, February 02, 2011 11:44 AM
> To: Rich Megginson; General discussion list for the 389 Directory server 
> project.
> Subject: Re: [389-users] 1.2.7.5 process disappearing, replication failing
>
> The process is completely gone.  Doesn't show up in ps, and the pid 
> referenced in the pid file doesn't exist.
>
> I do have a lot of lines like this in my access log:
> [02/Feb/2011:10:05:06 -0500] conn=4479 op=-1 fd=161 closed - B1
>
> On the positive side, I was able to get some of the replicas downgraded to 
> 1.2.4.  I had been deleting the server from the site under NetscapeRoot and 
> re-registering it, but I hadn't re-created the replication agreement; I was 
> just re-initializing the existing one.  Deleting it and creating a new one 
> got rid of the error: "Unable to parse the response to the startReplication 
> extended operation.  Replication is aborting".
>
> Four of the six systems I put back to 1.2.4 (by removing the RPMs and blowing 
> away all dirsrv relics left behind, reinstalling, and re-configuring).  Two 
> of them I initialize and can see the directory, but when I do an ldapsearch 
> remotely I get "result: 32 No such object".  More random/unpredictable 
> behavior...
>
>
> -----Original Message-----
> From: Rich Megginson [mailto:rmegg...@redhat.com]
> Sent: Wednesday, February 02, 2011 11:10 AM
> To: General discussion list for the 389 Directory server project.
> Cc: Andrew Kerr
> Subject: Re: [389-users] 1.2.7.5 process disappearing, replication failing
>
> On 02/02/2011 09:06 AM, Andrew Kerr wrote:
>> I'm running a single master with 13 replicas, all CentOS 5.5.  The master, 
>> and a few of the slaves, are running 1.2.7.5.  We were previously on 1.2.4, 
>> with most replicas still on that version.
> You might be running into https://bugzilla.redhat.com/show_bug.cgi?id=668619
> The symptom of that bug is your server will just stop responding to
> requests, including server-to-server requests like replication.  Your
> server will still be running.
>
> Does ps -ef|grep slapd show your server process is running?
> Do you see the messages like "op=-1 fd=66 closed - T2" in your access log?
>> All of a sudden, the slapd process on the 1.2.7.5 replicas has started to 
>> disappear.  There is nothing in the error log with the level at 8192.  It's 
>> just gone.  I can start it up and it'll last about 5 minutes.  Replication 
>> is what seems to be breaking - it seems to go away right after an update.
>>
>> I've tried rolling the replicas back to 1.2.4, but when I initialize the 
>> consumers I get "Unable to parse the response to the startReplication 
>> extended operation.  Replication is aborting".
>>
>> Any suggestions on where to go from this point?  It seems 1.2.7.5 is HIGHLY 
>> unstable.  But it seems it can't initialize 1.2.4 replicas (??), or maybe it 
>> just doesn't work at all.
>>
>> I'm not sure what the safe way is to roll back the master from 1.2.7.5.  Can 
>> I use "yum downgrade" safely?  At least now my master and the replicas on 
>> 1.2.4 are working, and I don't want to risk completely taking down LDAP.
>>
>> Is there a good stable version I ought to be at?  I upgraded from 1.2.4 
>> because of a number of other bugs, although none of them as bad as 1.2.7.5 
>> seems to be.
>>
>> Thanks - any help is greatly appreciated.
>>
>> --
>> 389 users mailing list
>> 389-us...@lists.fedoraproject.org
>> https://admin.fedoraproject.org/mailman/listinfo/389-users

