What kind of hardware is this that's connected to the servers, and what does the software do that you can't test it before installing it on production servers?
On 6 August 2016 at 02:14, Elazar Leibovich <elaz...@gmail.com> wrote:
> All real servers, with custom hardware attached, geographically
> distributed across the planet.
>
> Real people actually use the hardware attached to these computers, and
> it's not obvious to test whether or not it has failed.
>
> The strategy therefore is: deploy randomly to a small percentage of the
> machines, wait to see if you get complaints from those customers using
> these hardware devices, and if everything went well, update the rest of
> the servers.
>
> The provisioning solution is Chef, but I'm open to changing it. As I
> said, I don't think it makes too much difference.
>
> As for immutable server images, I'd do it with ZFS/btrfs snapshots
> (+docker/machinectl/systemd-nspawn if you must have some sort of
> virtual environment), and it's probably a better idea than apt-get
> install pkg=oldversion. An immutable filesystem for execution is of
> course not enough, since you might have migrations for the mutable
> part, etc. In this particular case, I don't think it's a big deal.
>
> You see, not everything is a web startup with a customer-facing website ;-)
>
> Thanks,
> I appreciate you sharing your experience.
> I'm not disagreeing with your points, but in this particular case,
> where testing is expensive, not all of them seem valid.
>
> On Fri, Aug 5, 2016 at 3:15 PM, Amos Shapira <amos.shap...@gmail.com>
> wrote:
>
>> What provisioning tools do you use to manage these servers? Please
>> tell me you aren't doing all of this manually.
>> Also, what's your environment? All hardware servers? Any
>> virtualisation involved? Cloud servers?
>>
>> Reading your question, it feels like you are setting yourself up to
>> fail instead of minimising the failure altogether.
>>
>> What I suggest is that you test your package automatically in a test
>> environment (to me, Vagrant + RSpec/Serverspec would be the first
>> candidates to check), then roll the package out to the repository for
>> the servers to pick it up.
>>
>> As for "roll-back" - with comprehensive automatic testing this concept
>> is becoming obsolete; there is no such thing as "roll-back", only
>> "roll-forward". That is, since testing and rolling out are small and
>> "cheap", it should be feasible to fix whatever problem was found
>> instead of having to revert the change altogether.
>>
>> If you are in a properly supported virtual environment then I'd even
>> go for immutable server images (e.g. Packer building AMIs, or Docker
>> containers); then it's a matter of just firing up an instance of the
>> new image, both when testing and in production.
>>
>> --Amos
>>
>> On 3 August 2016 at 16:55, Elazar Leibovich <elaz...@gmail.com> wrote:
>>
>>> How exactly you connect to the server is not in the scope of the
>>> discussion, and I agree that ansible is a sensible solution.
>>>
>>> But what you're proposing is to manually update the package on a
>>> small percentage of the machines.
>>>
>>> A manual solution is fine, but I would like to hear the experience of
>>> people who have actually done that on many servers.
>>>
>>> There are many other issues, for example, how do you roll back?
>>>
>>> apt-get remove exposes you to the risk that the uninstallation script
>>> would be buggy. There are other solutions, e.g. btrfs snapshots on
>>> root partitions, but I'm curious to hear someone experienced with it
>>> expose issues I hadn't even thought of.
>>>
>>> Another issue is, how do you select the servers you try it on?
>>>
>>> You suggested a static "beta" list, and I think it's better to select
>>> the candidates randomly on each update.
>>>
>>> Anyhow, how exactly you connect to the server is not the essence of
>>> the issue.
>>>
>>> On Wed, Aug 3, 2016 at 9:30 AM, Evgeniy Ginzburg <nad....@gmail.com>
>>> wrote:
>>>
>>>> Hello.
>>>> I'm assuming that you have passwordless ssh to the servers in
>>>> question as root.
>>>> I also assume that you don't use central management/deployment
>>>> software (ansible/puppet/chef).
>>>> In similar cases I usually use parallel-ssh (GNU parallel is another
>>>> alternative).
>>>> First stage: install the package manually on one server to see that
>>>> the configuration is OK, daemons restart, etc.
>>>> If this stage is OK, the second step will be creating the "complain"
>>>> list of servers and installing the package on them through
>>>> parallel-ssh.
>>>> Instead of waiting for complaints, one can define metrics to check
>>>> and use some monitoring appliance for verification.
>>>> In case of failure, remove the package from the repository and
>>>> remove/install it again.
>>>> The third stage will be a parallel-ssh install on all the servers.
>>>>
>>>> P.S. In the case of a few tens of servers I'd prefer to work with
>>>> ansible or an alternative; it's worth it in most cases.
>>>>
>>>> Best Regards, Evgeniy.
>>>>
>>>> On Tue, Aug 2, 2016 at 8:50 PM, Elazar Leibovich <elaz...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have a few (say, a few tens of) Debian machines, with a local
>>>>> repository defined.
>>>>>
>>>>> In the local repository I have some home-made packages that I build
>>>>> and push to the local repository.
>>>>>
>>>>> When I upgrade my package, I want to be sure the update won't cause
>>>>> a problem.
>>>>>
>>>>> So I wish to install it on a small percentage of the machines and
>>>>> wait for complaints.
>>>>>
>>>>> If complaints arrive - roll back.
>>>>> Otherwise, keep upgrading the rest of the machines.
>>>>>
>>>>> I'll appreciate your advice and experience with a similar situation.
>>>>> I'll especially appreciate it if someone who has actual real-life
>>>>> experience with this situation would mention it in the comments.
>>>>>
>>>>> Thanks,
>>>>
>>>>
>>>> --
>>>> So long, and thanks for all the fish.
>>>
>>>
>>
>>
>> --
>> <http://au.linkedin.com/in/gliderflyer>
>
>

--
<http://au.linkedin.com/in/gliderflyer>
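Putting the thread's suggestions together as concrete commands: below is a
minimal sketch of the random-canary rollout, assuming passwordless root ssh,
a hosts.txt file listing every server (one per line), and a home-made package
called mypkg already pushed to the local repository. All of those names are
placeholders for illustration, not anything taken from the original posts.

    # Pick a random ~10% of the fleet as this update's canary set.
    TOTAL=$(wc -l < hosts.txt)
    CANARY=$(( (TOTAL + 9) / 10 ))
    shuf -n "$CANARY" hosts.txt > canary.txt

    # Install the new version on the canary machines only.
    parallel-ssh -h canary.txt -l root -i \
        'apt-get update -qq && apt-get install -y mypkg'

    # Later, once the canaries look healthy, upgrade everyone else.
    grep -vxFf canary.txt hosts.txt > rest.txt
    parallel-ssh -h rest.txt -l root -i \
        'apt-get update -qq && apt-get install -y mypkg'

Regenerating canary.txt with shuf on every update gives the random selection
Elazar prefers over a static "beta" list.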
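Evgeniy's point about checking metrics instead of waiting for complaints can
start as a smoke test run over the same canary list. In this sketch, mydaemon
stands in for whatever service the package actually ships; adapt the check to
a real health metric.

    # Any host where the command fails shows up as [FAILURE] in the
    # parallel-ssh output, pointing at a bad canary.
    parallel-ssh -h canary.txt -l root -i \
        'dpkg-query -W mypkg && systemctl is-active --quiet mydaemon'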
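If complaints do arrive, the roll-back the thread keeps circling around is a
version-pinned downgrade on the canary set, combined with pulling the new
version out of the local repository so nothing else picks it up. The version
number below is made up for the example.

    # Downgrade the canaries to the last known-good version.
    parallel-ssh -h canary.txt -l root -i \
        'apt-get install -y mypkg=1.2-3'

Elazar's worry still applies, though: a downgrade runs the package's
maintainer scripts, so it is only as safe as those scripts are.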
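For the coarser safety net Elazar mentions, a btrfs snapshot of the root
subvolume taken just before the canary upgrade costs little and gives you
something to diff against, or restore, if a maintainer script misbehaves.
This sketch only applies where / really is a btrfs subvolume, and restoring
the old root (set-default plus a reboot) is more involved than taking the
snapshot itself.

    # Read-only snapshot of / before touching the canaries.
    parallel-ssh -h canary.txt -l root -i \
        'btrfs subvolume snapshot -r / /.pre-mypkg-upgrade'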
_______________________________________________
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il