Hi Roey,

Thanks for the tip. I have made the change you suggested and kicked off an overnight test run. I will let you know in the morning whether this fixes the issue.
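For anyone following the thread, here is a rough sketch for checking the kernel semaphore limits before and after editing `/etc/sysctl.conf`. It only reads the current values; it does not reproduce the line from the paste linked as [2] below. The helper name `show_sem_limits` is made up for illustration, and the field order is assumed from the proc(5) description of `/proc/sys/kernel/sem`.

```shell
#!/bin/sh
# Hypothetical helper: label the four System V semaphore limits.
# Assumed field order (per proc(5)): SEMMSL SEMMNS SEMOPM SEMMNI.
show_sem_limits() {
    # $1: four whitespace-separated integers, e.g. the contents of
    # /proc/sys/kernel/sem. Left unquoted on purpose so it word-splits.
    set -- $1
    if [ "$#" -ne 4 ]; then
        echo "expected 4 fields, got $#" >&2
        return 1
    fi
    echo "SEMMSL=$1 SEMMNS=$2 SEMOPM=$3 SEMMNI=$4"
}

# On a Linux host, print the live values (the file is absent elsewhere).
if [ -r /proc/sys/kernel/sem ]; then
    show_sem_limits "$(cat /proc/sys/kernel/sem)"
fi
```

After changing `/etc/sysctl.conf`, `sudo sysctl -p` reloads the file without a reboot, and running the check again should show the new limits.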
Thanks
-Sukhdev

On Mon, Mar 10, 2014 at 4:17 PM, Roey Chen <ro...@mellanox.com> wrote:

> Hi,
>
> Hope this could help,
>
> I encountered this issue myself not too long ago on an Ubuntu 12.04 host;
> it didn't happen again after messing with the Kernel Semaphore Limits
> parameters [1]:
>
> Adding this [2] line to `/etc/sysctl.conf` seems to do the trick.
>
> - Roey
>
> [1] http://paste.openstack.org/show/73086/
> [2] http://paste.openstack.org/show/73082/
>
> ------------------------------
> *From:* Dane Leblanc (leblancd) [lebla...@cisco.com]
> *Sent:* Monday, March 10, 2014 11:54 PM
> *To:* Sukhdev Kapur; Sean Dague; John Griffith
> *Cc:* openstack-infra@lists.openstack.org
> *Subject:* Re: [OpenStack-Infra] tgt restart fails in Cinder startup
> "start: job failed to start"
>
> Sean, John:
>
> I’ve had a similar experience to Sukhdev’s… I had tried doing clean.sh on
> every run, but that didn’t help prevent the tgt problem, and it doesn’t
> help recover from it.
>
> Sounds like the best option is to reset the VM for each run.
>
> Thanks,
> Dane
>
> *From:* Sukhdev Kapur [mailto:sukhdevka...@gmail.com]
> *Sent:* Monday, March 10, 2014 4:33 PM
> *To:* Sean Dague
> *Cc:* Dane Leblanc (leblancd); openstack-infra@lists.openstack.org
> *Subject:* Re: [OpenStack-Infra] tgt restart fails in Cinder startup
> "start: job failed to start"
>
> Hi Sean,
>
> In my case, for every run, I do unstack.sh, clean.sh, sudo rm -rf
> devstack, sudo rm -rf /opt/stack.
>
> Then I go get everything fresh and run stack.sh, followed by a full run
> of smoke tests.
>
> A few iterations of this sequence will get you into this condition. Once
> in this condition, neither clean.sh nor unstack.sh helps; it fails solid
> 100% of the time. If I reboot the VM, everything works just fine for the
> next 10-20 cycles until it hits the same condition.
> So, I am planning on modifying the script to reboot the VM every two
> hours or so as a workaround, but the underlying problem appeared close
> to the Icehouse check-ins. I started to notice this a few days before
> the Icehouse deadline; prior to that I was running the same sequence
> without any issue (for several weeks), if that helps any...
>
> -Sukhdev
>
> On Mon, Mar 10, 2014 at 1:07 PM, Sean Dague <s...@dague.net> wrote:
>
> So, honestly, running stack.sh / unstack.sh that many times in a row
> really isn't expected to work in my experience. You should at minimum be
> doing ./clean.sh to try to reset the state further.
>
> -Sean
>
> On 03/10/2014 03:00 PM, Dane Leblanc (leblancd) wrote:
> > In my case, the base OS is 12.04 Precise.
> >
> > The problem is intermittent in that it takes maybe 15 to 20 cycles of
> > unstack/stack to get it into the failure mode, but once in the failure
> > mode, it appears that the tgt daemon is 100% dead in the water.
> >
> > -----Original Message-----
> > From: Sean Dague [mailto:s...@dague.net]
> > Sent: Monday, March 10, 2014 1:49 PM
> > To: Dane Leblanc (leblancd); openstack-infra@lists.openstack.org
> > Subject: Re: [OpenStack-Infra] tgt restart fails in Cinder startup
> > "start: job failed to start"
> >
> > What base OS? A change was made there recently to better handle debian
> > because we believed (possibly incorrectly) that precise actually had
> > working init scripts.
> >
> > It would be interesting to understand if this was a 100% failure, or
> > only intermittent, and what base OS it was on.
> >
> > -Sean
> >
> > On 03/10/2014 11:37 AM, Dane Leblanc (leblancd) wrote:
> >> I don't know if anyone can give me some troubleshooting advice with
> >> this issue.
> >>
> >> I'm seeing an occasional problem whereby after several DevStack
> >> unstack.sh/stack.sh cycles, the tgt daemon (tgtd) fails to start
> >> during Cinder startup.
> >> Here's a snippet from the stack.sh log:
> >>
> >> 2014-03-10 07:09:45.214 | Starting Cinder
> >> 2014-03-10 07:09:45.215 | + return 0
> >> 2014-03-10 07:09:45.216 | + sudo rm -f /etc/tgt/conf.d/stack.conf
> >> 2014-03-10 07:09:45.217 | + _configure_tgt_for_config_d
> >> 2014-03-10 07:09:45.218 | + [[ ! -d /etc/tgt/stack.d/ ]]
> >> 2014-03-10 07:09:45.219 | + is_ubuntu
> >> 2014-03-10 07:09:45.220 | + [[ -z deb ]]
> >> 2014-03-10 07:09:45.221 | + '[' deb = deb ']'
> >> 2014-03-10 07:09:45.222 | + sudo service tgt restart
> >> 2014-03-10 07:09:45.223 | stop: Unknown instance:
> >> 2014-03-10 07:09:45.619 | start: Job failed to start
> >> jenkins@neutronpluginsci:~/devstack$ 2014-03-10 07:09:45.621 | + exit_trap
> >> 2014-03-10 07:09:45.622 | + local r=1
> >> 2014-03-10 07:09:45.623 | ++ jobs -p
> >> 2014-03-10 07:09:45.624 | + jobs=
> >> 2014-03-10 07:09:45.625 | + [[ -n '' ]]
> >> 2014-03-10 07:09:45.626 | + exit 1
> >>
> >> Trying to restart tgt manually does not succeed either:
> >>
> >> jenkins@neutronpluginsci:~$ sudo service tgt restart
> >> stop: Unknown instance:
> >> start: Job failed to start
> >> jenkins@neutronpluginsci:~$ sudo tgtd
> >> librdmacm: couldn't read ABI version.
> >> librdmacm: assuming: 4
> >> CMA: unable to get RDMA device list
> >> (null): iser_ib_init(3263) Failed to initialize RDMA; load kernel modules?
> >> (null): fcoe_init(214) (null)
> >> (null): fcoe_create_interface(171) no interface specified.
> >> jenkins@neutronpluginsci:~$
> >>
> >> The config in /etc/tgt is:
> >>
> >> jenkins@neutronpluginsci:/etc/tgt$ ls -l
> >> total 8
> >> drwxr-xr-x 2 root root 4096 Mar 10 07:03 conf.d
> >> lrwxrwxrwx 1 root root 30 Mar 10 06:50 stack.d -> /opt/stack/data/cinder/volumes
> >> -rw-r--r-- 1 root root 58 Mar 10 07:07 targets.conf
> >> jenkins@neutronpluginsci:/etc/tgt$ cat targets.conf
> >> include /etc/tgt/conf.d/*.conf
> >> include /etc/tgt/stack.d/*
> >> jenkins@neutronpluginsci:/etc/tgt$ ls conf.d
> >> jenkins@neutronpluginsci:/etc/tgt$ ls /opt/stack/data/cinder/volumes
> >> jenkins@neutronpluginsci:/etc/tgt$
> >>
> >> I don't know if there's any missing Cinder config in my DevStack
> >> localrc files. Here's one that I'm using:
> >>
> >> MYSQL_PASSWORD=nova
> >> RABBIT_PASSWORD=nova
> >> SERVICE_TOKEN=nova
> >> SERVICE_PASSWORD=nova
> >> ADMIN_PASSWORD=nova
> >> ENABLED_SERVICES=g-api,g-reg,key,n-api,n-crt,n-obj,n-cpu,n-cond,cinder,c-sch,c-api,c-vol,n-sch,n-novnc,n-xvnc,n-cauth,horizon,rabbit
> >> enable_service mysql
> >> disable_service n-net
> >> enable_service q-svc
> >> enable_service q-agt
> >> enable_service q-l3
> >> enable_service q-dhcp
> >> enable_service q-meta
> >> enable_service q-lbaas
> >> enable_service neutron
> >> enable_service tempest
> >> VOLUME_BACKING_FILE_SIZE=2052M
> >> Q_PLUGIN=cisco
> >> declare -a Q_CISCO_PLUGIN_SUBPLUGINS=(openvswitch nexus)
> >> declare -A Q_CISCO_PLUGIN_SWITCH_INFO=([10.0.100.243]=admin:Cisco12345:22:neutronpluginsci:1/9)
> >> NCCLIENT_REPO=git://github.com/CiscoSystems/ncclient.git
> >> PHYSICAL_NETWORK=physnet1
> >> OVS_PHYSICAL_BRIDGE=br-eth1
> >> TENANT_VLAN_RANGE=810:819
> >> ENABLE_TENANT_VLANS=True
> >> API_RATE_LIMIT=False
> >> VERBOSE=True
> >> DEBUG=True
> >> LOGFILE=/opt/stack/logs/stack.sh.log
> >> USE_SCREEN=True
> >> SCREEN_LOGDIR=/opt/stack/logs
> >>
> >> Here are links to a log showing another localrc file that I use, and
> >> the corresponding stack.sh log:
> >>
> >> http://128.107.233.28:8080/job/neutron/1390/artifact/vpnaas_console_log.txt
> >> http://128.107.233.28:8080/job/neutron/1390/artifact/vpnaas_stack_sh_log.txt
> >>
> >> Does anyone have any advice on how to debug this, or recover from this
> >> (beyond rebooting the node)? Or am I missing any Cinder config?
> >>
> >> Thanks in advance for any help on this!!!
> >> Dane
> >>
> >> _______________________________________________
> >> OpenStack-Infra mailing list
> >> OpenStack-Infra@lists.openstack.org
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
> >>
> >
> > --
> > Sean Dague
> > Samsung Research America
> > s...@dague.net / sean.da...@samsung.com
> > http://dague.net
> >
>
> --
> Sean Dague
> Samsung Research America
> s...@dague.net / sean.da...@samsung.com
> http://dague.net
>
> _______________________________________________
> OpenStack-Infra mailing list
> OpenStack-Infra@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
>