On 02/02/2011 10:37 AM, Andrew Kerr wrote:
> I reinstalled the two replicas that were saying "No such object" and now they
> work - same exact cut-and-paste process that didn't work before.
>
> The good news is that I am back up and running (phew, what a morning!).
>
> I left one replica on 1.2.7.5, disabled behind our load balancer, so it is
> getting replicated to but no production traffic - with the intent of helping
> figure out what the problem is before others find it. I'll get a bug report
> filed since this seems like something new.
>
> FYI, these are all virtual machines (on a mix of vmware, kvm, and xen
> depending on the datacenter) and have very minimal installs, running no other
> apps, with no selinux or anything either.

Is the 1.2.7.5 server still crashing? If so, please post the last few lines
of the errors log before the crash.
See also here: http://directory.fedoraproject.org/wiki/FAQ#Debugging_Crashes

> -----Original Message-----
> From: 389-users-boun...@lists.fedoraproject.org
> [mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Andrew Kerr
> Sent: Wednesday, February 02, 2011 11:44 AM
> To: Rich Megginson; General discussion list for the 389 Directory server
> project.
> Subject: Re: [389-users] 1.2.7.5 process disappearing, replication failing
>
> The process is completely gone. Doesn't show up in ps, and the pid
> referenced in the pid file doesn't exist.
>
> I do have a lot of lines like this in my access log:
> [02/Feb/2011:10:05:06 -0500] conn=4479 op=-1 fd=161 closed - B1
>
> On the positive side, I was able to get some of the replicas downgraded to
> 1.2.4. I had been deleting the server from the site under netscaperoot and
> re-registering, but I hadn't re-created the replication agreement, I was just
> re-initializing the existing one. Deleting it and creating a new one got rid
> of the error: "Unable to parse the response to the startReplication extended
> operation. Replication is aborting".
>
> Four of the six systems I put back to 1.2.4 (by removing the RPMs and blowing
> away all dirsrv relics left behind, reinstalling, and re-configuring). Two
> of them I initialize and can see the directory, but when I do an ldapsearch
> remotely I get "result: 32 No such object". More random/unpredictable
> behavior...
>
>
> -----Original Message-----
> From: Rich Megginson [mailto:rmegg...@redhat.com]
> Sent: Wednesday, February 02, 2011 11:10 AM
> To: General discussion list for the 389 Directory server project.
> Cc: Andrew Kerr
> Subject: Re: [389-users] 1.2.7.5 process disappearing, replication failing
>
> On 02/02/2011 09:06 AM, Andrew Kerr wrote:
>> I'm running a single master with 13 replicas, all CentOS 5.5. The master,
>> and a few of the slaves, are running 1.2.7.5. We were previously on 1.2.4,
>> with most replicas still on that version.
> You might be running into https://bugzilla.redhat.com/show_bug.cgi?id=668619
> The symptom of that bug is your server will just stop responding to
> requests, including server-to-server requests like replication. Your
> server will still be running.
>
> Does ps -ef|grep slapd show your server process is running?
> Do you see the messages like "op=-1 fd=66 closed - T2" in your access log?
>> All of a sudden, the slapd process on the 1.2.7.5 replicas had just started
>> to disappear. Nothing in the error log with level at 8192. It's just gone. I
>> can start it up and it'll last about 5 minutes. Replication is what seems
>> to be breaking - it seems to go away right after an update.
>>
>> I've tried rolling the replicas back to 1.2.4, but when I initialize the
>> consumers I get "Unable to parse the response to the startReplication
>> extended operation. Replication is aborting".
>>
>> Any suggestions on where to go from this point? It seems 1.2.7.5 is HIGHLY
>> unstable. But it seems it can't initialize 1.2.4 replicas (??), or maybe it
>> just doesn't work at all.
>>
>> I'm not sure what the safe way is to roll back the master from 1.2.7.5, can
>> I use "yum downgrade" safely? At least now my master and the replicas on
>> 1.2.4 are working, I don't want to risk completely taking down ldap.
>>
>> Is there a good stable version I ought to be at? I upgraded from 1.2.4
>> because of a number of other bugs, although none of them as bad as 1.2.7.5
>> seems to be.
>>
>> Thanks - any help is greatly appreciated.
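
For anyone checking their own access log for the "op=-1 ... closed - T2" / "closed - B1" lines discussed above, here is a minimal sketch in Python that tallies the close codes. It assumes the line layout matches the single example quoted in this thread; the function name `tally_close_codes` and the second sample line's timestamp and conn number are made up for illustration, and what each code means should be checked against the server documentation rather than inferred from this script.

```python
import re
from collections import Counter

# Matches connection-close lines shaped like the one quoted in the thread:
# [02/Feb/2011:10:05:06 -0500] conn=4479 op=-1 fd=161 closed - B1
CLOSE_RE = re.compile(r"conn=(\d+) op=(-?\d+) fd=(\d+) closed - (\w+)")


def tally_close_codes(lines):
    """Count close codes (B1, T2, ...) on lines where op=-1."""
    counts = Counter()
    for line in lines:
        match = CLOSE_RE.search(line)
        if match and match.group(2) == "-1":
            counts[match.group(4)] += 1
    return counts


# Illustrative input: the first line is from the thread, the second is
# invented around the "op=-1 fd=66 closed - T2" fragment, and the third
# is an unrelated line that should be ignored.
sample = [
    "[02/Feb/2011:10:05:06 -0500] conn=4479 op=-1 fd=161 closed - B1",
    "[02/Feb/2011:10:05:07 -0500] conn=4480 op=-1 fd=66 closed - T2",
    "some other access log line",
]

for code, count in sorted(tally_close_codes(sample).items()):
    print(code, count)
```

To run it against a real log, pass an open file object instead of `sample`; the log path depends on the instance name, so it is left out here.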
--
389 users mailing list
389-us...@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users