[ https://issues.apache.org/jira/browse/CLOUDSTACK-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725222#comment-13725222 ]
Wei Zhou commented on CLOUDSTACK-3947:
--------------------------------------

Thanks to my colleagues Shashi Dahal and Mukesh Kumar for reproducing this issue.

There is a "sleep 2" statement in /root/reconfigLB.sh on the virtual routers (see below).
If one command is paused at "sleep 2" when another command comes in, this issue appears: the second run cannot find /var/run/haproxy.pid, which the first run has already moved to /var/run/haproxy.pid.old.

CloudStack has a mechanism to prevent concurrent operations on the same network/ipaddress/vpc: a command overrides the getSyncObjType and getSyncObjId methods.
However, these two methods are not overridden in CreateLBStickinessPolicyCmd.

I will commit a patch for this issue, and I will fix similar issues later.

------------------------------------------/root/reconfigLB.sh in VR------------------------------------------
ret=0
# save previous state
mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.old
mv /var/run/haproxy.pid /var/run/haproxy.pid.old

mv /etc/haproxy/haproxy.cfg.new /etc/haproxy/haproxy.cfg
kill -TTOU $(cat /var/run/haproxy.pid.old)
sleep 2
if haproxy -D -p /var/run/haproxy.pid -f /etc/haproxy/haproxy.cfg; then
  logger -t cloud "New haproxy instance successfully loaded, stopping previous one."
  kill -KILL $(cat /var/run/haproxy.pid.old)
  rm -f /var/run/haproxy.pid.old
  ret=0
else
  logger -t cloud "New instance failed to start, resuming previous one."
  kill -TTIN $(cat /var/run/haproxy.pid.old)
  rm -f /var/run/haproxy.pid
  mv /var/run/haproxy.pid.old /var/run/haproxy.pid
  mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.new
  mv /etc/haproxy/haproxy.cfg.old /etc/haproxy/haproxy.cfg
  ret=1
fi

exit $ret
------------------------------------------------------------------------------------

> haproxy is running but haproxy.pid file is missing in virtual router
> --------------------------------------------------------------------
>
> Key: CLOUDSTACK-3947
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3947
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public (Anyone can view this level - this is the default.)
> Affects Versions: 4.0.2
> Environment: cloudstack 4.0.2
> Reporter: Wei Zhou
> Assignee: Wei Zhou
> Priority: Critical
>
> I think this issue exists on 4.1/4.2 as well.
> agent.logs
> 2013-07-30 11:01:01,546 DEBUG
> [resource.virtualnetwork.VirtualRoutingResource]
> (agentRequest-Handler-5:null) + iflag=+ aflag=+ dflag=+ fflag=+ sflag=+
> getopts i:a:d:f:s: OPTION+ case $OPTION in+ iflag=1+ domRIp=169.254.1.253+
> getopts i:a:d:f:s: OPTION+ case $OPTION in+ fflag=1+
> cfgfile=/tmp/169_254_1_2537137377844466896245cfg+ getopts i:a:d:f:s: OPTION+
> case $OPTION in+ aflag=1+ addedIps=10.11.102.174:111:,+ getopts i:a:d:f:s:
> OPTION+ case $OPTION in+ sflag=1+ statsIps=10.11.102.174:8081:0/0:,,+ getopts
> i:a:d:f:s: OPTION+ cert=/root/.ssh/id_rsa.cloud+ '[' 11 '!=' 11 ']'+ check_gw
> 169.254.1.253+ ping -c 1 -n -q 169.254.1.253+ '[' 0 -gt 0 ']'+ return 0+ '['
> 0 -gt 0 ']'+ copy_haproxy 169.254.1.253
> /tmp/169_254_1_2537137377844466896245cfg+ local domRIp=169.254.1.253+ local
> cfg=/tmp/169_254_1_2537137377844466896245cfg+ scp -P 3922 -q -o
> StrictHostKeyChecking=no -i /root/.ssh/id_rsa.cloud
> /tmp/169_254_1_2537137377844466896245cfg
> root@169.254.1.253:/etc/haproxy/haproxy.cfg.new+ return 0+ '[' 0 -gt 0 ']'+
> ssh -p 3922 -q -o StrictHostKeyChecking=no -i /root/.ssh/id_rsa.cloud
> root@169.254.1.253 '/root/loadbalancer.sh -i 169.254.1.253 -f
> /tmp/169_254_1_2537137377844466896245cfg -a 10.11.102.174:111:, -s
> 10.11.102.174:8081:0/0:,,'mv: cannot stat `/var/run/haproxy.pid': No such
> file or directory[ALERT] 210/090100 (9007) : Starting proxy stats_on_public:
> cannot bind socket[ALERT] 210/090100 (9007) : Starting proxy
> 10_11_102_174-111: cannot bind socketcat: /var/run/haproxy.pid.old: No such
> file or directorykill: usage: kill [-s sigspec | -n signum | -sigspec] pid |
> jobspec ... or kill -l [sigspec]mv: cannot stat `/var/run/haproxy.pid.old':
> No such file or directory+ exit 1
> 2013-07-30 11:01:01,546 DEBUG [cloud.agent.Agent]
> (agentRequest-Handler-5:null) Seq 6-1836847202: { Ans: , MgmtId:
> 345051509349, via: 6, Ver: v1, Flags: 0,
> [{"Answer":{"result":false,"details":"+ iflag=+ aflag=+ dflag=+ fflag=+
> sflag=+ getopts i:a:d:f:s: OPTION+ case $OPTION in+ iflag=1+
> domRIp=169.254.1.253+ getopts i:a:d:f:s: OPTION+ case $OPTION in+ fflag=1+
> cfgfile=/tmp/169_254_1_2537137377844466896245cfg+ getopts i:a:d:f:s: OPTION+
> case $OPTION in+ aflag=1+ addedIps=10.11.102.174:111:,+ getopts i:a:d:f:s:
> OPTION+ case $OPTION in+ sflag=1+ statsIps=10.11.102.174:8081:0/0:,,+ getopts
> i:a:d:f:s: OPTION+ cert=/root/.ssh/id_rsa.cloud+ '[' 11 '!=' 11 ']'+ check_gw
> 169.254.1.253+ ping -c 1 -n -q 169.254.1.253+ '[' 0 -gt 0 ']'+ return 0+ '['
> 0 -gt 0 ']'+ copy_haproxy 169.254.1.253
> /tmp/169_254_1_2537137377844466896245cfg+ local domRIp=169.254.1.253+ local
> cfg=/tmp/169_254_1_2537137377844466896245cfg+ scp -P 3922 -q -o
> StrictHostKeyChecking=no -i /root/.ssh/id_rsa.cloud
> /tmp/169_254_1_2537137377844466896245cfg
> root@169.254.1.253:/etc/haproxy/haproxy.cfg.new+ return 0+ '[' 0 -gt 0 ']'+
> ssh -p 3922 -q -o StrictHostKeyChecking=no -i /root/.ssh/id_rsa.cloud
> root@169.254.1.253 '/root/loadbalancer.sh -i 169.254.1.253 -f
> /tmp/169_254_1_2537137377844466896245cfg -a 10.11.102.174:111:, -s
> 10.11.102.174:8081:0/0:,,'mv: cannot stat `/var/run/haproxy.pid': No such
> file or directory[ALERT] 210/090100 (9007) : Starting proxy stats_on_public:
> cannot bind socket[ALERT] 210/090100 (9007) : Starting proxy
> 10_11_102_174-111: cannot bind socketcat: /var/run/haproxy.pid.old: No such
> file or directorykill: usage: kill [-s sigspec | -n signum | -sigspec] pid |
> jobspec ... or kill -l [sigspec]mv: cannot stat `/var/run/haproxy.pid.old':
> No such file or directory+ exit 1","wait":0}}] }

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
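
The race described in the comment (a second command arriving while the first is paused at "sleep 2") can be reproduced in isolation. The sketch below is not the patch mentioned in the comment, which belongs in the management server via getSyncObjType/getSyncObjId. It only simulates the pid-file shuffle from reconfigLB.sh in a scratch directory, with no real haproxy involved. The function names (reconfig, locked_reconfig, run_pair), the mkdir-based lock, and the shortened sleep are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: mimics the pid-file handling of reconfigLB.sh in a temp dir.
# All names and paths here are hypothetical; nothing touches /var/run.

workdir=$(mktemp -d)
pidfile="$workdir/haproxy.pid"
lockdir="$workdir/reconfig.lock"
echo 1234 > "$pidfile"

# Critical section: move the pid file aside, pause, then write a new one.
reconfig() {
    # Fails when a concurrent run has already moved the pid file away.
    mv "$pidfile" "$pidfile.old" 2>/dev/null || return 1
    sleep 1                      # stands in for "sleep 2"
    echo 5678 > "$pidfile"       # stands in for "haproxy -D -p ..."
    rm -f "$pidfile.old"
}

# Same work, serialized with mkdir as an atomic lock.
locked_reconfig() {
    until mkdir "$lockdir" 2>/dev/null; do sleep 1; done
    reconfig
    rc=$?
    rmdir "$lockdir"
    return $rc
}

# Start the named function twice in parallel and print how many runs failed.
run_pair() {
    fails=0
    "$1" &
    p1=$!
    "$1" &
    p2=$!
    wait "$p1" || fails=$((fails + 1))
    wait "$p2" || fails=$((fails + 1))
    echo "$fails"
}

unlocked_failures=$(run_pair reconfig)

# Reset the simulated state before the serialized run.
echo 1234 > "$pidfile"
rm -f "$pidfile.old"
locked_failures=$(run_pair locked_reconfig)
final_pid=$(cat "$pidfile")

echo "failures without lock: $unlocked_failures"
echo "failures with lock:    $locked_failures"
rm -rf "$workdir"
```

Without the lock, one of the two runs typically fails at the first mv, which matches the "cannot stat" error in the agent log. With the runs serialized, both should complete. Serializing on the management-server side, as the comment proposes, achieves the same ordering without changing the script on the virtual router.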