On Feb 26, 2014, at 03:37 PM, Bill Filler wrote:

>I'm looking at this with the help of Omer. It's very strange as the
>failure is simply trying to switch from the main view to the Albums tab
>(via the ubuntu-ui-toolkit-emulator classes) and that operation does not
>succeed. It's getting a time out from dbus. This issue seems to occur
>in other smoketest failures as well, like this one
>http://ci.ubuntu.com/smokeng/trusty/touch/mako/209:20140226.1:20140224/6842/gallery_app/818684/.
>
>The test seems to be written correctly and I can't reproduce it on a
>device despite running it multiple times. Omer (or other AP experts) are
>needed here for the next step of evaluation.

This looks suspiciously like timeouts I see when running the system-image
test suite. I don't know how your tests are set up or under what
environment they're run, but I'm pretty well convinced there's some
unaccounted-for flakiness in D-Bus in some test environments.

Actually, I think there are two general D-Bus problems.

* dbus-daemon SIGHUP race conditions.

  In my system-image test suite, I start up a dbus-daemon with some custom
  system bus services. My service config files change depending on the
  test being run, but you cannot kill and restart dbus-daemon when this
  happens because libdbus only reads its private-bus environment variables
  once when the library initializes. The solution that works is to write
  the new config files, then SIGHUP dbus-daemon, which tells it to re-read
  its config files. Alternatively, you can call ReloadConfig() via the
  org.freedesktop.DBus interface on the / object.

  On a sufficiently beefy desktop box, this is quite reliable. Not so much
  on the buildds. Clearly there is a race condition even when
  ReloadConfig() is used. Even if the new configs are already in place
  when you SIGHUP/ReloadConfig(), dbus-daemon will sometimes complain that
  there is no .service file available for the service you're trying to
  D-Bus activate.

  I can see why SIGHUP would be racy, and my guess is that ReloadConfig()
  boils down to just SIGHUPing rather than doing something more sane, like
  synchronously reloading the configs and not returning until all of its
  data structures are up-to-date.

  I've taken to programming defensively around this one:

      import os
      import time

      import dbus

      # Optional delay (in seconds) after asking dbus-daemon to reload.
      OVERRIDE = os.environ.get('SYSTEMIMAGE_DBUS_DAEMON_HUP_SLEEP_SECONDS')
      HUP_SLEEP = (0 if OVERRIDE is None else int(OVERRIDE))
      ...
      def blah():
          service = dbus.SystemBus().get_object('org.freedesktop.DBus', '/')
          iface = dbus.Interface(service, 'org.freedesktop.DBus')
          # Ask dbus-daemon to re-read its config files, then wait.
          iface.ReloadConfig()
          time.sleep(HUP_SLEEP)

  When developing on my desktop, this calls time.sleep(0), which as I say
  always works on my development machine.
  Then in my d/rules I set SYSTEMIMAGE_DBUS_DAEMON_HUP_SLEEP_SECONDS to 2
  so that on the buildds there's a short blocking delay before continuing
  on to D-Bus activation. 2 seconds is the Goldilocks value; 1 is
  definitely too short. So far, with this change I've been able to very
  reliably avoid this particular problem on the buildds.

* Random D-Bus timeouts.

  This one is tougher, and more probably similar to what you're seeing.
  Just every once in a while I get random D-Bus timeouts in response to
  some messages. I've seen this on my desktop and laptop, and in PPAs and
  archive buildds, even when the systems do not seem to be overloaded.
  There's no pattern that I can see related to *which* methods time out;
  sometimes they break a test, other times they don't. Sometimes they are
  messages between system-image-dbus and its client, and other times they
  are between ubuntu-download-manager and system-image-dbus.

  I have no clue what's causing them, but fortunately(?), they're rare-ish
  for me. The only workaround I've come up with is to retry the test when
  developing locally, or to retry the build. Usually it will Just Work the
  second time or, more rarely, the third.

I'd love to know more about what's going on here, but I suspect we're both
seeing symptoms of the same problem.

Cheers,
-Barry
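
P.S. For the random timeouts, here is a minimal sketch of what an automated
version of the "just retry it" workaround might look like. This is not code
from the system-image test suite. The names retry_on_timeout and
looks_like_dbus_timeout, and the attempt and delay defaults, are hypothetical.
It also assumes that a timed-out call raises an exception whose
get_dbus_name() returns org.freedesktop.DBus.Error.NoReply or
org.freedesktop.DBus.Error.Timeout, which is what I'd expect from
dbus-python but have not verified in every environment.

```python
import time

# Assumption: D-Bus error names that indicate a timed-out method call.
DBUS_TIMEOUT_NAMES = {
    'org.freedesktop.DBus.Error.NoReply',
    'org.freedesktop.DBus.Error.Timeout',
}


def looks_like_dbus_timeout(exc):
    """Return True if exc carries one of the assumed timeout error names."""
    get_name = getattr(exc, 'get_dbus_name', None)
    return callable(get_name) and get_name() in DBUS_TIMEOUT_NAMES


def retry_on_timeout(call, attempts=3, delay=1.0,
                     is_timeout=looks_like_dbus_timeout, sleep=time.sleep):
    """Run call() up to `attempts` times, retrying only on timeouts.

    Any non-timeout exception, and the timeout from the final attempt,
    is re-raised unchanged so real failures stay visible.
    """
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception as exc:
            if not is_timeout(exc) or attempt == attempts:
                raise
            # Brief pause before trying again.
            sleep(delay)
```

Something like retry_on_timeout(lambda: iface.ReloadConfig()) would wrap a
single call. Note that this only papers over the flakiness; it says nothing
about what is actually causing the timeouts.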