Excerpts from Beber's message of Sat Oct 22 17:42:55 UTC 2011:
> Public bug reported:
>
> I'm using Juju rev.409 inside a VirtualBox machine running Oneiric.
> I only use the local provider.
>
> I'm able to bootstrap the environment and create a first unit.
> But I never managed to create a second unit (neither another unit from the
> same service nor a unit from a different service).
>
> The machine-agent.log file showed lots of errors related to the connection
> with zookeeper:
> 2011-10-22 13:46:25,280:5297(0xb6f8bb70):ZOO_ERROR@handle_socket_error_msg@1528: Socket [192.168.122.1:55580] zk retcode=-7, errno=110(Connection timed out): connection timed out (exceeded timeout by 24ms)
> 2011-10-22 13:46:26,173:5297(0xb6f8bb70):ZOO_WARN@zookeeper_interest@1461: Exceeded deadline by 1719ms
> 2011-10-22 13:46:29,512:5297(0xb6f8bb70):ZOO_WARN@zookeeper_interest@1461: Exceeded deadline by 5056ms
> 2011-10-22 13:46:30,486:5297(0xb6f8bb70):ZOO_INFO@check_events@1585: initiated connection to server [192.168.122.1:55580]
> 2011-10-22 13:46:36,240:5297(0xb6f8bb70):ZOO_ERROR@handle_socket_error_msg@1528: Socket [192.168.122.1:55580] zk retcode=-7, errno=110(Connection timed out): connection timed out (exceeded timeout by 1ms)
> 2011-10-22 13:46:36,431:5297(0xb6f8bb70):ZOO_WARN@zookeeper_interest@1461: Exceeded deadline by 253ms
> 2011-10-22 13:46:39,986:5297(0xb6f8bb70):ZOO_WARN@zookeeper_interest@1461: Exceeded deadline by 3768ms
> 2011-10-22 13:46:40,084:5297(0xb6f8bb70):ZOO_INFO@check_events@1585: initiated connection to server [192.168.122.1:55580]
> 2011-10-22 13:46:41,146:5297(0xb6f8bb70):ZOO_ERROR@handle_socket_error_msg@1621: Socket [192.168.122.1:55580] zk retcode=-112, errno=116(Stale NFS file handle): sessionId=0x1332b4783a40001 has expired.
>
> These errors occurred during the instantiation of the first unit. After
> the first unit was created, no new lines were appended to
> machine-agent.log.
>
> I changed the tickTime value from 2000 to 20000 in
> /usr/share/pyshared/juju/lib/zk.py and my problem has gone:
> - no more error messages
> - I can now create another unit after the first one
>
> The behaviour I observed should not occur on real hardware, but it may
> be useful to be able to set this parameter in the environments.yaml
> file.
>
> Also, it would be nice if the machine agent could reconnect to zookeeper
> in case the connection is lost for a moment.
>
> ** Affects: juju (Ubuntu)
>      Importance: Undecided
>          Status: New
>
> ** Project changed: juju => juju (Ubuntu)
>
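For context on why the tickTime change likely helped: with default settings, a ZooKeeper server bounds the negotiated session timeout to between 2 and 20 times its tickTime. The sketch below illustrates that clamping. It assumes the value edited in zk.py is the server's tickTime; the function name and the 10000 ms client request are hypothetical, chosen for illustration rather than taken from juju's code.

```python
def negotiated_session_timeout(requested_ms, tick_time_ms):
    """Clamp a client's requested session timeout the way a ZooKeeper
    server does with its default bounds (2x and 20x tickTime)."""
    min_timeout = 2 * tick_time_ms
    max_timeout = 20 * tick_time_ms
    return max(min_timeout, min(requested_ms, max_timeout))


# Hypothetical client request of 10 seconds (an assumption, not juju's value).
requested = 10000
for tick in (2000, 20000):
    print(tick, negotiated_session_timeout(requested, tick))
```

Under these assumed numbers, tickTime=2000 leaves the session timeout at 10 seconds, while tickTime=20000 raises the floor to 40 seconds, which would ride out multi-second stalls like those in the log above.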
Work is in progress to properly handle session expiration and
reconnection in the agents. We can increase the default session
timeout as well.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/880023

Title:
  machine agent disconnects from zoopkeeper on heavy loads

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/juju/+bug/880023/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs