Alfred Victor wrote: > I don't see a directive equivalent of SECURE_NFS to add to nfs.conf (all > documentation seems to still refer to the sysconfig path), or is it the > same? Can I just disable rpcgssd? We have no nfs mounts which are > kerberized yet, and disabling rpcgssd seems to solve our problem, and I > can kinit after disabling rpcgssd. It also does not seem that disabling > rpcgssd hurts running tasks or node, but would like to confirm it's > limited to nfs in function. I still have to wonder if there's a better way.
I suspect that SECURE_NFS is a no-op these days. It was necessary in RHEL 5/6 for sure. Yes, I think you can safely disable the service. In some versions of RHEL/Fedora IIRC this is symlinked to nfs-secure-server so mask that as well. rob > > Alfred > > On Tue, Jun 15, 2021 at 10:31 AM Alexander Bokovoy <aboko...@redhat.com > <mailto:aboko...@redhat.com>> wrote: > > On ti, 15 kesä 2021, Alfred Victor via FreeIPA-users wrote: > >Hi Rob, > > > >We attempted setting sec=sys on the mount, however to our surprise > found > >this didn't work. We then figured out that IPA install is adding > this to > >/etc/sysconfig/nfs: > > > >SECURE_NFS=yes > > > > > >We tried removing this to no avail and restarting all the related > sytstemd > >units (rpcgssd, nfs, etc). Any idea why sec=sys is being ignored? > Should we > >need to set SECURE_NFS=no? On non-IPA nodes this directive does not > exist > >at all. For now, I have also totally disabled rpcgssd as I think > this unit > >may be responsible ( it seems that it does the upcall in > >https://access.redhat.com/solutions/225783 ) so I will hope that this > >solves it, I don't believe anything else depends on rpcgssd but > will soon > >find out. Any suggestions please? :) > > Depends on nfs-utils version? I remember there had been a change in > configuration in upstream nfs-utils in 2019: > > commit c69875c8afdd877baf7139c0cd5241f70105cbd4 > Author: François Cami <fc...@redhat.com <mailto:fc...@redhat.com>> > Date: Tue Feb 26 13:59:06 2019 +0100 > > ipa-client-automount: handle NFS configuration file changes > > nfs-utils in Fedora 30 and later switched its configuration > file from /etc/sysconfig/nfs to /etc/nfs.conf, providing a > conversion service (nfs-convert.service) for upgrades. > However, for new installs the original configuration file > is missing. This change: > * adds a tuple-based osinfo.version_number method to handle > more kinds of OS versioning schemes > * detects RHEL and Fedora versions with the the new nfs-utils > behavior > * avoids backing up the new NFS configuration file as we do > not have to modify it. > > See: https://bugzilla.redhat.com/show_bug.cgi?id=1676981 > > Fixes: https://pagure.io/freeipa/issue/7868 > > > > > >Alfred > > > >On Thu, Jun 10, 2021 at 2:17 PM Rob Crittenden <rcrit...@redhat.com > <mailto:rcrit...@redhat.com>> wrote: > > > >> Alfred Victor wrote: > >> > Thanks very much Rob et al! I believe we have found our root > cause and > >> > the fix. If you like I'll provide some more details after we're > done > >> > with everything. > >> > >> Yes, knowing the cause would be great and could be helpful to others! > >> > >> cheers > >> > >> rob > >> > >> > > >> > Alfred > >> > > >> > On Thu, Jun 10, 2021 at 11:02 AM Rob Crittenden > <rcrit...@redhat.com <mailto:rcrit...@redhat.com> > >> > <mailto:rcrit...@redhat.com <mailto:rcrit...@redhat.com>>> wrote: > >> > > >> > Alfred Victor wrote: > >> > > Hi all, > >> > > > >> > > Just curious if anyone has suggestions about that please > before I > >> get > >> > > going in a couple of hours with conversions to IPA again? > I did > >> > the math > >> > > and 1,097,471 log messages in 5 hours is about 60 times per > >> second, so > >> > > I'm gradually becoming more certain this is why we can > only boot > >> 20-30 > >> > > nodes at a time when we used to boot hundreds. However, > this is > >> still > >> > > just a guess as I don't know the mechanism behind why this > >> interferes > >> > > with IPA joins, some bottleneck with the KDC? > >> > > >> > It sure seems like this is a kerberized NFS request for a > host that > >> > doesn't provide it, or doesn't have an nfs principal. I > think you'd > >> need > >> > to monitor the offending client to see what it is doing. > >> > > >> > If the KDC is being dramatically slowed down by this then > yes, you > >> could > >> > see slower Apache performance because it needs to obtain > tickets on > >> > behalf of the user doing the join. Whether that would represent > >> itself > >> > as a read timeout I don't know. > >> > > >> > rob > >> > > >> > > > >> > > Alfred > >> > > > >> > > On Wed, Jun 9, 2021 at 3:49 PM Alfred Victor > <alvic...@gmail.com <mailto:alvic...@gmail.com> > >> > <mailto:alvic...@gmail.com <mailto:alvic...@gmail.com>> > >> > > <mailto:alvic...@gmail.com <mailto:alvic...@gmail.com> > <mailto:alvic...@gmail.com <mailto:alvic...@gmail.com>>>> wrote: > >> > > > >> > > Hi Rob, > >> > > > >> > > I have reduced that timeout and will tune it further. > >> > Regarding ISE > >> > > errors, I think we can make the assumption that this is > >> > entirely an > >> > > issue of the web timeouts, I haven't seen any evidence > >> > otherwise and > >> > > will have another attempt at converting nodes > tomorrow, and > >> with a > >> > > keener eye of what to look for I can make a better > >> determination > >> > > then. I am most concerned over what the underlying > cause might > >> be > >> > > causing it to take too long and hit the timeout, and > don't > >> want to > >> > > engineer around this by changing Apache timeouts if > we can > >> instead > >> > > address the root cause. I am suspicious of krb log > messages > >> > flooding > >> > > our IPA systems about a service principal like so, > but not > >> sure if > >> > > this is gumming up the works (or even why this > message has > >> started > >> > > appearing since the rebuild): > >> > > > >> > > ./krb5kdc.log:Jun 09 10:38:51 redacted.redacted.com > <http://redacted.redacted.com> > >> > <http://redacted.redacted.com> <http://redacted.redacted.com> > >> > krb5kdc[31187](info): TGS_REQ (4 etypes {18 17 16 23}) > 10.1.1.27 > >> > <http://10.1.1.27>: LOOKING_UP_SERVER: authtime 0, > >> > host/redacted.redacted....@redacted.com for > >> > nfs/nfsserver.redacted....@redacted.com, Server not found in > >> > Kerberos database > >> > > > >> > > > >> > > Just in the last 5 hours alone, this log message and > others > >> > like it (main difference is just the nodename it > references) has > >> > appeared 1,097,471 times. Conceivably there is also some > log write > >> > locking or something going on that could be slowing IPA > down and > >> > leading to our symptom here? > >> > > > >> > > Alfred > >> > > > >> > > > >> > > On Wed, Jun 9, 2021 at 9:19 AM Rob Crittenden > >> > <rcrit...@redhat.com <mailto:rcrit...@redhat.com> > <mailto:rcrit...@redhat.com <mailto:rcrit...@redhat.com>> > >> > > <mailto:rcrit...@redhat.com > <mailto:rcrit...@redhat.com> <mailto:rcrit...@redhat.com > <mailto:rcrit...@redhat.com>>>> > >> wrote: > >> > > > >> > > Alfred Victor wrote: > >> > > > Hi Rob, > >> > > > > >> > > > We did revert to 60s - I seem to remember some > ldapsearch > >> > > timing out > >> > > > previously but maybe we could still greatly > reduce this > >> with > >> > > no ill > >> > > > effect. However, we saw no change in join > success either > >> way > >> > > and I have > >> > > > not changed anything in Apache as I would need > to find > >> the > >> > > exact values > >> > > > in question and I think these directive changes > may get > >> lost > >> > > with an > >> > > > update? The timeout/ISE issues are new - we > previously > >> > booted > >> > > many nodes > >> > > > concurrently (400+) without problems but now it > happens > >> even > >> > > booting as > >> > > > few as 50 results in 10-20 timeouts or ISE. > This has > >> > been the > >> > > case at > >> > > > least since we rebuilt and re-replicated the > environment > >> by > >> > > swapping IPA > >> > > > members out of the old one, but maybe this is > unrelated. > >> > Is it > >> > > possible > >> > > > we landed on a newer version with different timeout > >> > values? Is it > >> > > > possible that IPA is under slightly more load > from the > >> > higher > >> > > number of > >> > > > nodes and has some bottleneck since we last did > >> conversions? > >> > > We have not > >> > > > been able to substantiate the latter by looking at > >> > > CPU/memory/io trends. > >> > > > What might we investigate next to see if we may > have > >> missed > >> > > some ongoing > >> > > > issue that could for instance cause locking > problems or > >> > > something else > >> > > > internally in IPA to explain our symptoms? > Also, I have > >> > > observed some > >> > > > nodes which report a successful installation of IPA > >> client, > >> > > but have in > >> > > > fact a lot of failures, for instance with > mounts not > >> > working from > >> > > > automount setup. We will need to try to > reproduce this to > >> > > understand > >> > > > what happened I think as we have already sorted > those > >> nodes > >> > > elsewhere > >> > > > and they have got going outside of IPA. > >> > > > > >> > > > I am interested that you suggested "at least > some of the > >> > > clients aren't > >> > > > connecting at all and increasing the timeout > could make > >> this > >> > > worse" - > >> > > > this might indicate some sort of network problem at > >> > play, but > >> > > as far as > >> > > > I am aware, everything is working absent IPA so > I do not > >> > > suspect it > >> > > > presently. > >> > > > >> > > A 60 second LDAP timeout is still way too big. > The default > >> > is 2. > >> > > Unless > >> > > you are seeing timeouts I'd suggest lowering it. > >> > > > >> > > I'm only seeing hints to what you're seeing and > the scope > >> so > >> > > it's hard > >> > > to make further suggestions. Are all the internal > errors > >> the > >> > > same, for > >> > > example? Are some failing due to LDAP timeouts or > are they > >> all > >> > > wsgi read > >> > > timeouts? > >> > > > >> > > The Apache request timeout can be configured in > httpd.conf > >> > which is > >> > > independent of the config files that IPA > currently writes > >> > so it > >> > > should > >> > > survive upgrades. > >> > > > >> > > rob > >> > > > > >> > > > Alfred > >> > > > > >> > > > On Mon, Jun 7, 2021 at 4:42 PM Rob Crittenden > >> > > <rcrit...@redhat.com <mailto:rcrit...@redhat.com> > <mailto:rcrit...@redhat.com <mailto:rcrit...@redhat.com>> > >> > <mailto:rcrit...@redhat.com <mailto:rcrit...@redhat.com> > <mailto:rcrit...@redhat.com <mailto:rcrit...@redhat.com>>> > >> > > > <mailto:rcrit...@redhat.com > <mailto:rcrit...@redhat.com> <mailto:rcrit...@redhat.com > <mailto:rcrit...@redhat.com>> > >> > <mailto:rcrit...@redhat.com <mailto:rcrit...@redhat.com> > <mailto:rcrit...@redhat.com <mailto:rcrit...@redhat.com>>>>> wrote: > >> > > > > >> > > > Alfred Victor wrote: > >> > > > > Actually, no change happened from 300-> 600 > >> > timeout, the > >> > > web portal > >> > > > > itself gave me an ISE I hadn't noticed > when I tried > >> > > clicking save! > >> > > > > >> > > > I wasn't clear which log to look in. You'll see > >> details > >> > > about where the > >> > > > error is caught in IPA in the Apache log. > To see > >> LDAP > >> > > timeouts you look > >> > > > for err=3 in the 389-ds access log. > >> > > > > >> > > > But since you posted a traceback this time it's > >> > clear that > >> > > this is just > >> > > > Apache waiting for client data to read so > any tuning > >> you > >> > > do needs to be > >> > > > in Apache. You could try tuning Timeout which > >> > defaults to > >> > > 60 but this > >> > > > doesn't seem likely to help since at least > some of > >> the > >> > > clients aren't > >> > > > connecting at all and increasing the > timeout could > >> make > >> > > this worse. > >> > > > > >> > > > Please revert the searchtimeout. 300 seconds is > >> > orders of > >> > > magnitude > >> > > > too big. > >> > > > > >> > > > rob > >> > > > > >> > > > > > >> > > > > Alfred > >> > > > > > >> > > > > On Mon, Jun 7, 2021 at 3:57 PM Alfred Victor > >> > > <alvic...@gmail.com <mailto:alvic...@gmail.com> > <mailto:alvic...@gmail.com <mailto:alvic...@gmail.com>> > >> > <mailto:alvic...@gmail.com <mailto:alvic...@gmail.com> > <mailto:alvic...@gmail.com <mailto:alvic...@gmail.com>>> > >> > > > <mailto:alvic...@gmail.com > <mailto:alvic...@gmail.com> > >> > <mailto:alvic...@gmail.com <mailto:alvic...@gmail.com>> > <mailto:alvic...@gmail.com <mailto:alvic...@gmail.com> > >> > <mailto:alvic...@gmail.com <mailto:alvic...@gmail.com>>>> > >> > > > > <mailto:alvic...@gmail.com > <mailto:alvic...@gmail.com> > >> > <mailto:alvic...@gmail.com <mailto:alvic...@gmail.com>> > <mailto:alvic...@gmail.com <mailto:alvic...@gmail.com> > >> > <mailto:alvic...@gmail.com <mailto:alvic...@gmail.com>>> > >> > > <mailto:alvic...@gmail.com > <mailto:alvic...@gmail.com> <mailto:alvic...@gmail.com > <mailto:alvic...@gmail.com>> > >> > <mailto:alvic...@gmail.com <mailto:alvic...@gmail.com> > <mailto:alvic...@gmail.com <mailto:alvic...@gmail.com>>>>>> wrote: > >> > > > > > >> > > > > Hi FreeIPA list, > >> > > > > > >> > > > > I don't see any in error log that > match `grep > >> > -i "err=3" > >> > > > > /var/log/httpd/error_log`. We have tried > >> raising > >> > > > searchtimelimit as > >> > > > > high as 120, then 300 (now are trying > 600) but > >> > > observed no > >> > > > > difference in the rate at which nodes > >> succeeded or > >> > > failed in IPA > >> > > > > joins. We are somewhat puzzled by > this, as > >> none of > >> > > the other > >> > > > values > >> > > > > we are aware of might have changed, > though it > >> is > >> > > possible that the > >> > > > > IPA systems are under a little higher > demand > >> > from client > >> > > > systems, we > >> > > > > have tried to mitigate this by > shutting down > >> some > >> > > workflows and > >> > > > > aren't sure whether we've seen any > improvement. > >> > > Short of adjusting > >> > > > > apache/resource/process timeouts it is > >> > difficult to > >> > > say what might > >> > > > > be wrong. To give an example, out of > 250 nodes > >> > > rebooted, only 112 > >> > > > > joined IPA successfully. Here is some > output > >> from > >> > > the error log, > >> > > > > following what this looks like in the ipa > >> client > >> > > install log > >> > > > (error > >> > > > > log output will match the node attempt): > >> > > > > > >> > > > > > >> > > > > 2021-06-07T18:25:30Z DEBUG The > >> ipa-client-install > >> > > command > >> > > > failed, exception: NetworkError: cannot > connect to > >> > > > 'https://redactednode.com/ipa/json > >> > > > <https://hauth0004.dug.com/ipa/json>': Internal > >> > Server Error > >> > > > > > >> > > > > [Mon Jun 07 13:25:06.198259 2021] > [core:error] > >> > [pid > >> > > 25020] > >> > > > [client 10.1.24.48:47808 > <http://10.1.24.48:47808> <http://10.1.24.48:47808> > >> > <http://10.1.24.48:47808> > >> > > <http://10.1.24.48:47808> > >> > > > <http://10.1.24.48:47808>] Script timed out > before > >> > returning > >> > > > headers: wsgi.py, referer: > >> > > https://redacted.redacted.com/ipa/xml > >> > > > <https://hauth0004.dug.com/ipa/xml> > >> > > > > > >> > > > > Different node, same time period: > >> > > > > > >> > > > > > >> > > > > [Mon Jun 07 13:24:02.178092 2021] > [:error] [pid > >> > > 25725] ipa: > >> > > > INFO: [xmlserver] mach_j...@redacted.com: > >> > > join(u'redacted.node.com > <http://redacted.node.com> <http://redacted.node.com> > >> > <http://redacted.node.com> > >> > > > <http://redacted.node.com> > <http://redacted.node.com > >> >', > >> > > > nshardwareplatform=u'x86_64', > >> > > > > >> nsosversion=u'3.10.0-1062.18.1.1.el7.redacted.x86_64', > >> > > > version=u'2.51'): TimeLimitExceeded > >> > > > > > >> > > > > I also saw this: > >> > > > > > >> > > > > [Mon Jun 07 13:25:07.103503 2021] > [:error] [pid > >> > > 25725] ipa: > >> > > > ERROR: non-public: IOError: request data > read error > >> > > > > [Mon Jun 07 13:25:07.103529 2021] > [:error] > >> > [pid 25725] > >> > > > Traceback (most recent call last): > >> > > > > [Mon Jun 07 13:25:07.103536 2021] > [:error] [pid > >> > > 25725] File > >> > > > > >> > "/usr/lib/python2.7/site-packages/ipaserver/rpcserver.py", > >> > > line 360, > >> > > > in wsgi_execute > >> > > > > [Mon Jun 07 13:25:07.103542 2021] > [:error] [pid > >> > > 25725] > >> > > > data = read_input(environ) > >> > > > > [Mon Jun 07 13:25:07.103548 2021] > [:error] [pid > >> > > 25725] File > >> > > > > >> > "/usr/lib/python2.7/site-packages/ipaserver/rpcserver.py", > >> > > line 200, > >> > > > in read_input > >> > > > > [Mon Jun 07 13:25:07.103553 2021] > [:error] [pid > >> > > 25725] > >> > > > return > >> > environ['wsgi.input'].read(length).decode('utf-8') > >> > > > > [Mon Jun 07 13:25:07.103559 2021] > [:error] > >> > [pid 25725] > >> > > > IOError: request data read error > >> > > > > [Mon Jun 07 13:25:07.103826 2021] > [:error] [pid > >> > > 25725] ipa: > >> > > > INFO: [xmlserver] mach_j...@redacted.com: None: > >> > InternalError > >> > > > > [Mon Jun 07 13:25:07.149962 2021] > [:error] [pid > >> > > 25726] ipa: > >> > > > ERROR: non-public: IOError: request data > read error > >> > > > > [Mon Jun 07 13:25:07.149984 2021] > [:error] > >> > [pid 25726] > >> > > > Traceback (most recent call last): > >> > > > > [Mon Jun 07 13:25:07.149991 2021] > [:error] [pid > >> > > 25726] File > >> > > > > >> > "/usr/lib/python2.7/site-packages/ipaserver/rpcserver.py", > >> > > line 360, > >> > > > in wsgi_execute > >> > > > > [Mon Jun 07 13:25:07.149997 2021] > [:error] [pid > >> > > 25726] > >> > > > data = read_input(environ) > >> > > > > [Mon Jun 07 13:25:07.150002 2021] > [:error] [pid > >> > > 25726] File > >> > > > > >> > "/usr/lib/python2.7/site-packages/ipaserver/rpcserver.py", > >> > > line 200, > >> > > > in read_input > >> > > > > [Mon Jun 07 13:25:07.150008 2021] > [:error] [pid > >> > > 25726] > >> > > > return > >> > environ['wsgi.input'].read(length).decode('utf-8') > >> > > > > [Mon Jun 07 13:25:07.150013 2021] > [:error] > >> > [pid 25726] > >> > > > IOError: request data read error > >> > > > > > >> > > > > > >> > > > > > >> > > > > After setting the timeout to 600 and > rebooting > >> the > >> > > remaining > >> > > > 139 nodes from the initial set of 250, 83 > joined of > >> the > >> > > 139 and we > >> > > > still had ISE occurring. In some cases, it > would ISE > >> on > >> > > the first > >> > > > attempt, try another IPA system, and > succeed. I'm > >> > not sure > >> > > that even > >> > > > such a long timeout as 600 has helped. > >> > > > > > >> > > > > Alfred > >> > > > > > >> > > > > > >> > > > > > >> > > > > > >> > > > > On Thu, Jun 3, 2021 at 7:51 PM Rob > Crittenden > >> > > > <rcrit...@redhat.com > <mailto:rcrit...@redhat.com> <mailto:rcrit...@redhat.com > <mailto:rcrit...@redhat.com>> > >> > <mailto:rcrit...@redhat.com <mailto:rcrit...@redhat.com> > <mailto:rcrit...@redhat.com <mailto:rcrit...@redhat.com>>> > >> > > <mailto:rcrit...@redhat.com > <mailto:rcrit...@redhat.com> <mailto:rcrit...@redhat.com > <mailto:rcrit...@redhat.com>> > >> > <mailto:rcrit...@redhat.com <mailto:rcrit...@redhat.com> > <mailto:rcrit...@redhat.com <mailto:rcrit...@redhat.com>>>> > >> > > > > <mailto:rcrit...@redhat.com > <mailto:rcrit...@redhat.com> > >> > <mailto:rcrit...@redhat.com <mailto:rcrit...@redhat.com>> > >> > > <mailto:rcrit...@redhat.com > <mailto:rcrit...@redhat.com> <mailto:rcrit...@redhat.com > <mailto:rcrit...@redhat.com>>> > >> > <mailto:rcrit...@redhat.com <mailto:rcrit...@redhat.com> > <mailto:rcrit...@redhat.com <mailto:rcrit...@redhat.com>> > >> > > <mailto:rcrit...@redhat.com > <mailto:rcrit...@redhat.com> > >> > <mailto:rcrit...@redhat.com > <mailto:rcrit...@redhat.com>>>>>> wrote: > >> > > > > > >> > > > > Alfred Victor via FreeIPA-users > wrote: > >> > > > > > Hi FreeIPA list, > >> > > > > > > >> > > > > > We are having an issue with our IPA > >> > > environment of 4 > >> > > > > replicated FreeIPA > >> > > > > > systems serving linux compute > clients > >> > > which join from a > >> > > > command in > >> > > > > > rc.local after boot. This > worked in the > >> > past, > >> > > but the system > >> > > > > has been > >> > > > > > rebuilt since and the join command > >> changed > >> > > > slightly. Unfortunately > >> > > > > > booting a few dozen nodes at a > time, > >> though > >> > > they each > >> > > > talk to a > >> > > > > > different IPA system by design, > leads to > >> > > problems such as > >> > > > > these - though > >> > > > > > 40-100 nodes can boot ok at a > time there > >> are > >> > > always many > >> > > > > stragglers, and > >> > > > > > the more we attempt to boot at > once the > >> more > >> > > fail to > >> > > > join IPA > >> > > > > (if we try > >> > > > > > to boot 500 nodes, we are lucky > if we > >> get a > >> > > fifth of that > >> > > > > joining IPA). > >> > > > > > Can you please advise on this > output? > >> > Here is > >> > > our join > >> > > > command in > >> > > > > > compute node rc.local: > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > ipa-client-install -U -q -p > >> mach_join \ > >> > > > > > -w <redacted> \ > >> > > > > > --force-join \ > >> > > > > > --no-dns-sshfp \ > >> > > > > > > --automount-location=redacted-node > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > And here is some log output of > the 500 > >> > error: > >> > > > > > > >> > > > > > ProtocolError: > <ProtocolError for > >> > > > > redacted.redacted.com/ipa/json > <http://redacted.redacted.com/ipa/json> > >> > <http://redacted.redacted.com/ipa/json> > >> > > <http://redacted.redacted.com/ipa/json> > >> > > > <http://redacted.redacted.com/ipa/json> > >> > > > > > <http://redacted.redacted.com/ipa/json> > >> > > > > > <http://redacted.redacted.com/ipa/json>: > >> 500 > >> > > Internal > >> > > > Server Error> > >> > > > > > Cannot connect to the > server due to > >> > > generic error: > >> > > > cannot > >> > > > > connect to > >> > 'https://redacted.redacted.com/ipa/json > >> > > > > > <https://hauth0003.dug.com/ipa/json>': > >> > Internal > >> > > Server Error > >> > > > > > > >> > > > > > > >> > > > > > As well as: > >> > > > > > > >> > > > > > 2021-06-02T21:39:11Z DEBUG > Starting > >> > > external process > >> > > > > > 2021-06-02T21:39:11Z DEBUG > >> > > args=/usr/sbin/ipa-join -s > >> > > > > > redacted.redacted.com > <http://redacted.redacted.com> > >> > <http://redacted.redacted.com> > >> > > <http://redacted.redacted.com> < > >> http://redacted.redacted.com> > >> > > > <http://redacted.redacted.com> > >> > > > > <http://redacted.redacted.com> -b > >> > > > > > dc=redacted,dc=com -h > >> > > redactednode.redacted.com > <http://redactednode.redacted.com> > >> > <http://redactednode.redacted.com> > <http://redactednode.redacted.com > >> > > >> > > > <http://redactednode.redacted.com> > >> > > > > <http://redactednode.redacted.com> > >> > > > > > > <http://redactednode.redacted.com> > >> -f > >> > > > > > 2021-06-02T21:40:13Z DEBUG > Process > >> > > finished, return > >> > > > code=17 > >> > > > > > 2021-06-02T21:40:13Z DEBUG > stdout= > >> > > > > > 2021-06-02T21:40:13Z DEBUG > >> stderr=HTTP > >> > > response code is > >> > > > > 500, not 200 > >> > > > > > 2021-06-02T21:40:13Z ERROR > Joining > >> realm > >> > > failed: HTTP > >> > > > > response code > >> > > > > > is 500, not 200 > >> > > > > > > >> > > > > > And we also see timeouts happen: > >> > > > > > > >> > > > > > > >> > > > > > 2021-06-02T22:08:50Z DEBUG > >> > > args=/usr/sbin/ipa-join -s > >> > > > > > redacted.redacted.com > <http://redacted.redacted.com> > >> > <http://redacted.redacted.com> > >> > > <http://redacted.redacted.com> < > >> http://redacted.redacted.com> > >> > > > <http://redacted.redacted.com> > >> > > > > <http://redacted.redacted.com> -b > >> > > > > > dc=redacted,dc=com -h > >> > > redactednode.redacted.com > <http://redactednode.redacted.com> > >> > <http://redactednode.redacted.com> > <http://redactednode.redacted.com > >> > > >> > > > <http://redactednode.redacted.com> > >> > > > > <http://redactednode.redacted.com> > >> > > > > > > <http://redactednode.redacted.com> > >> -f > >> > > > > > 2021-06-02T22:09:01Z DEBUG > Process > >> > > finished, return > >> > > > code=17 > >> > > > > > 2021-06-02T22:09:01Z DEBUG > stdout= > >> > > > > > 2021-06-02T22:09:01Z DEBUG > stderr=RPC > >> > > failed at server. > >> > > > > Configured > >> > > > > > time limit exceeded > >> > > > > > 2021-06-02T22:09:01Z ERROR > Joining > >> realm > >> > > failed: RPC > >> > > > failed at > >> > > > > > server. Configured time limit > >> exceeded > >> > > > > > > >> > > > > > > >> > > > > > And we also see later timeouts > near the > >> > end of > >> > > the log > >> > > > in some > >> > > > �� > cases though are able to > authenticate and > >> it > >> > > didn't back > >> > > > out the > >> > > > > install, but never got going healthy > >> either: > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > 2021-06-03T19:20:13Z DEBUG The > >> > > ipa-client-install > >> > > > command > >> > > > > failed, > >> > > > > > exception: TimeLimitExceeded: > >> Configured > >> > > time limit > >> > > > exceeded > >> > > > > > >> > > > > When you see Internal Error look > to the > >> Apache > >> > > error log > >> > > > on the > >> > > > > server > >> > > > > for more information. > >> > > > > > >> > > > > In this case an LDAP search is > failing > >> because > >> > > the server > >> > > > is too > >> > > > > busy. > >> > > > > Look for queries failing with > err=3 to get > >> an > >> > > idea of how long > >> > > > > it is taking. > >> > > > > > >> > > > > To increase the timeout use: ipa > config-mod > >> > > > --searchtimelimit=INT > >> > > > > > >> > > > > The default is 2 seconds. > >> > > > > > >> > > > > You can pick a time at random but > could see > >> > > failures again. > >> > > > > > >> > > > > rob > >> > > > > > >> > > > > >> > > > >> > > >> > >> > > > > > -- > / Alexander Bokovoy > Sr. Principal Software Engineer > Security / Identity Management Engineering > Red Hat Limited, Finland > _______________________________________________ FreeIPA-users mailing list -- freeipa-users@lists.fedorahosted.org To unsubscribe send an email to freeipa-users-le...@lists.fedorahosted.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedorahosted.org/archives/list/freeipa-users@lists.fedorahosted.org Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure