If we have consensus that apt-get update during vm init is a bad idea then this patch might be a good quick solution [0].

Regards,
Thanh

[0] https://gerrit.fd.io/r/4797

On Thu, Jan 19, 2017 at 10:47 PM, Ed Warnicke <hagb...@gmail.com> wrote:

> Thanh,
>
> I'm not quite sure of the logic of having it at that particular point
> either. Something to investigate.
>
> Ed
>
> On Thu, Jan 19, 2017 at 8:44 PM, Thanh Ha <thanh...@linuxfoundation.org>
> wrote:
>
>> FWIW in OpenDaylight we don't typically run yum update or apt-get update
>> in our init-scripts on VM spinup. At the job level we only install
>> dependencies needed by the build. I'm not sure why fd.io is running
>> upgrades but it was existing in the script when I looked at it. System
>> upgrades during VM spinup are not something the OpenDaylight project does
>> at least.
>>
>> Regards,
>> Thanh
>>
>> On Thu, Jan 19, 2017 at 10:38 PM, Dave Wallace <dwallac...@gmail.com>
>> wrote:
>>
>>> Ed, Thanh, Vanessa,
>>>
>>> IMHO, updating the ubuntu packages every time a VM is spun up is a bug
>>> wrt. being able to reproduce some (hopefully rare) build/test issues.
>>> Since every VM is potentially running with different versions of OS
>>> components, when a failure occurs (e.g. in "make test"), then it may be
>>> necessary to recreate the exact run-time environment in order to reproduce
>>> the failure. Unless the complete package list is being archived for every
>>> VM instance that is spun up, this may not be possible.
>>>
>>> My experience is that in those rare cases where a tool or environment
>>> issue causes a failure, the cost to find the issue is extraordinarily high
>>> if you do not have the ability to recreate the EXACT build/run-time
>>> environment. This is why CSIT does not update OS components in the VM
>>> initialization scripts and the VM images are built from a specific package
>>> list instead of pulling the latest versions from the apt repositories.
>>>
>>> My recommendation is that the VM images be updated periodically (weekly
>>> or whenever a new security update is released) and the package lists
>>> archived for each VM image version. Each VM image should also be verified
>>> against a known good VPP commit version as is done with CSIT branches.
>>> Ideally we should build a fully automated continuous deployment model to
>>> reduce the amount of work to update the VM images to running a Jenkins job
>>> to build/test/deploy a new VM image from the latest package versions.
>>>
>>> With that automation in place, this mechanism could be extended for use
>>> by CSIT as well as "make test", thus ensuring that all of our testing was
>>> done with the same OS component version. Ideally, all projects should be
>>> using the same OS components to ensure that everything is tested in the
>>> same run-time environment.
>>>
>>> Thanks,
>>> -daw-
>>>
>>> On 1/19/2017 8:31 PM, Thanh Ha via RT wrote:
>>>
>>> The issue with the 16.04 Ubuntu image is fixed now (but we may require some
>>> additional actions which I'll send to Vanessa in case this issue comes
>>> up again). We fixed this issue tonight by rebuilding ubuntu1604 and
>>> deploying the new image.
>>>
>>> I'm going to close this ticket as resolved and we'll take the additional
>>> task to find a way to ensure this doesn't appear again off of this ticket.
>>>
>>> If you're not interested in the detailed analysis you can stop reading now.
>>>
>>> For those interested I suspect that the lock issue will appear again
>>> (although I could be wrong). The reason I believe so is that our vm init
>>> script runs "apt-get update" as an initialization step when the VM boots up
>>> at creation time via this script [0]. Ed mentioned that we didn't see this
>>> in the past and it only started to appear again recently as we deployed
>>> another patch to disable Ubuntu's unattended updates.
>>>
>>> I believe a possible reason we will see this issue appear again due to [0]
>>> is because we switched from using JClouds to OpenStack Jenkins plugins
>>> for node spin-up and there's a difference in how the init-script is
>>> executed depending on which plugin is being used.
>>>
>>> JClouds Plugin:
>>>
>>> 1) boot vm
>>> 2) wait for ssh access
>>> 3) copies init-script into vm via ssh
>>> 4) executes init-script, and doesn't continue processing until script is
>>> complete
>>> 5) once init-script is complete, passes vm over to job and job starts
>>>
>>> OpenStack Plugin:
>>>
>>> 1) boot vm and passes init-script in as User Data
>>> 2) init-script runs inside vm without Jenkins intervention, thus is a
>>> non-blocking function
>>> 3) in parallel jenkins waits for ssh access to vm
>>> 4) ssh's into vm and passes vm over to job and job starts running
>>>
>>> In the OpenStack plugin case step 4 can execute while step 2 is still
>>> running apt-get update in the background because it was a non-blocking
>>> function.
>>>
>>> A few ideas I have to get around this.
>>>
>>> a) Allow init-script to continue running apt-get update however have a
>>> shell script at the start of Ubuntu jobs that waits for the lock to get
>>> released before allowing the job to start
>>>
>>> b) Remove apt-get update from init-script and make the job run apt-get
>>> update at the beginning of its execution
>>>
>>> c) Regularly update VMs to ensure that apt-get update always runs quickly
>>>
>>> Regards,
>>> Thanh
>>>
>>> [0]
>>> https://git.fd.io/ci-management/tree/jenkins-scripts/basic_settings.sh#n14
>>>
>>> On Thu Jan 19 19:23:59 2017, hagbard wrote:
>>>
>>> FYI... helpdesk is on it, and it's being worked in #fdio-infra on IRC
>>>
>>> Ed
>>>
>>> On Thu, Jan 19, 2017 at 4:31 PM, Ed Warnicke <hagb...@gmail.com> wrote:
>>>
>>> Looping in help desk.
>>>
>>> On Thu, Jan 19, 2017 at 4:16 PM Dave Barach (dbarach) <dbar...@cisco.com>
>>> wrote:
>>>
>>> Folks,
>>>
>>> See https://jenkins.fd.io/job/vpp-verify-master-ubuntu1604/3378/console
>>>
>>> 11:00:46 E: Could not get lock /var/lib/dpkg/lock - open (11: Resource
>>> temporarily unavailable)
>>>
>>> 11:00:46 E: Unable to lock the administration directory (/var/lib/dpkg/),
>>> is another process using it?
>>>
>>> I recognize this failure from my own Ubuntu 16.04 system: a cron-job
>>> starts “apt-get -q”, which for whatever reason does not terminate. As a
>>> workaround, “sudo killall apt-get || true” before trying to acquire build
>>> dependencies...
>>>
>>> HTH... Dave
>>>
>>> _______________________________________________
>>> vpp-dev mailing list
>>> vpp-dev@lists.fd.io
>>> https://lists.fd.io/mailman/listinfo/vpp-dev
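
For reference, idea (a) in Thanh's analysis (wait for the lock to be released before the job starts) could be sketched roughly as below. This is only an illustration, not the script used in ci-management. The function name wait_for_lock_release and its timeout and poll defaults are made up for this example. It assumes the lock on /var/lib/dpkg/lock is an fcntl-style lock, which is what the "Resource temporarily unavailable" error suggests, and that the caller may open the file for writing (in practice, root).

```python
import errno
import fcntl
import time


def wait_for_lock_release(path="/var/lib/dpkg/lock", timeout=300.0, poll=2.0):
    """Return True once no other process holds a lock on `path`.

    Returns False if the lock is still held when `timeout` seconds have passed.
    The name and defaults are hypothetical, chosen for this sketch only.
    """
    deadline = time.monotonic() + timeout
    while True:
        # An exclusive fcntl lock needs a descriptor opened for writing.
        with open(path, "a") as handle:
            try:
                fcntl.lockf(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except OSError as err:
                # EAGAIN/EACCES mean another process holds the lock.
                if err.errno not in (errno.EACCES, errno.EAGAIN):
                    raise
            else:
                # We only probe: release immediately so apt/dpkg can take it.
                fcntl.lockf(handle, fcntl.LOCK_UN)
                return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

Note that this only probes the lock. Another process could still grab it between the check and the job's own apt-get call, so it narrows the race with the init-script rather than removing it.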
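
Dave Wallace's suggestion to archive the package list for each VM image version could look something like the sketch below. It is an assumption-laden example, not an existing fd.io tool. The helper names (capture_package_list, parse_package_list, diff_package_lists) are hypothetical, and it assumes the list is captured with dpkg-query in a "name version" per line format.

```python
import subprocess


def capture_package_list():
    """Return installed packages as 'name version' lines (requires dpkg-query)."""
    result = subprocess.run(
        ["dpkg-query", "-W", "--showformat=${Package} ${Version}\\n"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def parse_package_list(text):
    """Parse 'name version' lines into a {name: version} dict."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Lines without a version (e.g. removed packages) are skipped.
        if len(parts) == 2:
            packages[parts[0]] = parts[1]
    return packages


def diff_package_lists(old, new):
    """Return {name: (old_version, new_version)} for packages that differ."""
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes
```

Saving the output of capture_package_list() next to each image build, and running diff_package_lists() on two archived lists, would show which OS components changed between the image a failing job ran on and the current one.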