Yes, I created a custom ISO with debug output. It didn't help, so I created another one with strace.

On Jan 27, 2016 00:56, "Alex Schultz" <aschu...@mirantis.com> wrote:
> On Tue, Jan 26, 2016 at 2:16 PM, Stanislaw Bogatkin
> <sbogat...@mirantis.com> wrote:
> > When the stratum is too high, ntpdate can detect this and always
> > writes it into its log. In our case there is just no log: ntpdate
> > sends the first packet, gets an answer, and that's all. So fudging
> > won't save us, I think. Also, it's a really bad approach to fudge a
> > server which doesn't have a real clock onboard.
>
> Do you have the debug output of ntpdate somewhere? I'm not finding
> it in the bugs or in some of the snapshots for the failures. I did
> find one snapshot with the -v change that didn't have any response
> information, so maybe it's the other problem, where network
> connectivity isn't working correctly or the responses are getting
> dropped somewhere?
>
> -Alex
>
> > On Tue, Jan 26, 2016 at 10:41 PM, Alex Schultz <aschu...@mirantis.com>
> > wrote:
> >>
> >> On Tue, Jan 26, 2016 at 11:42 AM, Stanislaw Bogatkin
> >> <sbogat...@mirantis.com> wrote:
> >> > Hi guys,
> >> >
> >> > for some time we have had a bug [0] with ntpdate. It doesn't
> >> > reproduce 100% of the time, but it breaks our BVT and swarm
> >> > tests. It's not clear exactly where the root of the problem
> >> > lies. To better understand this, some verbosity was added to the
> >> > ntpdate output, but in the logs we can see only that the packet
> >> > exchange between ntpdate and the server was started and was
> >> > never completed.
> >> >
> >>
> >> So when I've hit this in my local environments there are usually
> >> one or two possible causes: 1) lack of network connectivity, so
> >> the ntp server never responds, or 2) the stratum is too high. My
> >> assumption is that we're running into #2 because of our
> >> revert-resume in testing. When we resume, the ntp server on the
> >> master may take a while to become stable. This sync in the
> >> deployment uses the fuel master for synchronization, so if the
> >> stratum is too high, it will fail with this lovely useless error.
> >> My assumption is that this happens because we aren't using a set
> >> of internal ntp servers but are rather relying on the standard
> >> ntp.org pools. So when the master is being resumed, it struggles
> >> to find a good enough set of servers and takes a while to sync.
> >> This then causes these deployment tasks to fail because the master
> >> has not yet stabilized (might also be geolocation related). We
> >> could address this either by fudging the stratum on the master
> >> server in the configs or possibly by introducing our own more
> >> stable local ntp servers. I have a feeling fudging the stratum
> >> might be better when we only use the master in our ntp
> >> configuration.
> >>
> >> > As this bug is a blocker, I propose merging [1] to better
> >> > understand what's going on. I created a custom ISO with this
> >> > patchset and tried to run about 10 BVT tests on this ISO, with
> >> > absolutely no luck. So if we merge this, we will catch the
> >> > problem much faster and understand the root cause.
> >> >
> >>
> >> I think we should merge the increased logging patch anyway because
> >> it'll be useful in troubleshooting, but we also might want to look
> >> into getting an ntp peers list added into the snapshot.
> >>
> >> > I appreciate your answers, folks.
> >> >
> >> >
> >> > [0] https://bugs.launchpad.net/fuel/+bug/1533082
> >> > [1] https://review.openstack.org/#/c/271219/
> >> > --
> >> > with best regards,
> >> > Stan.
> >> >
> >>
> >> Thanks,
> >> -Alex
> >>
> >> __________________________________________________________________________
> >> OpenStack Development Mailing List (not for usage questions)
> >> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
> > --
> > with best regards,
> > Stan.
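
The two candidate causes named in the thread (the server never answers, or it answers with a stratum too high to be usable) can be told apart with a single SNTP-style query. Below is a minimal sketch of that idea, not ntpdate's actual selection logic. The names `build_request`, `classify_reply` and `probe` are hypothetical, and the rule used (stratum 0, stratum above 15, or leap indicator 3 means "unsynchronized") follows the general NTP packet convention rather than anything confirmed in the bug report.

```python
import socket
import struct

NTP_PORT = 123
MAX_USABLE_STRATUM = 15  # by NTP convention, stratum 16 means "unsynchronized"


def build_request() -> bytes:
    """Build a 48-byte NTP client request (LI=0, VN=3, Mode=3)."""
    first_byte = (0 << 6) | (3 << 3) | 3
    return bytes([first_byte]) + b"\x00" * 47


def classify_reply(packet: bytes) -> str:
    """Classify a raw NTP reply as 'ok', 'unsynchronized' or 'malformed'."""
    if len(packet) < 48:
        return "malformed"
    first_byte, stratum = struct.unpack("!BB", packet[:2])
    leap = first_byte >> 6
    # Assumption: treat stratum 0, stratum > 15, or leap indicator 3
    # (clock not synchronized) as a server that cannot be used yet.
    if stratum == 0 or stratum > MAX_USABLE_STRATUM or leap == 3:
        return "unsynchronized"
    return "ok"


def probe(host: str, timeout: float = 2.0) -> str:
    """Send one request to host; return 'no-response' on timeout or error."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_request(), (host, NTP_PORT))
            packet, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts, DNS failures, unreachable hosts
            return "no-response"
    return classify_reply(packet)
```

Calling `probe()` with the master's address from a slave node right after a revert-resume would give one of `no-response`, `unsynchronized`, `malformed` or `ok`, which is roughly the information that was missing from the snapshots.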