Long ago it was a disk. The problem was that these disks had to live somewhere they could survive migrations, which didn't work well for block-based primary storage, at least for the code base at the time. Using a virtio socket was seen as a fairly standard way to pass temporary information to the guest, and it didn't require managing the lifecycle of a special disk.

I believe the current problem is that the sender needs to remain connected until the receiver has read. Maybe socat does this, but if so we need to ensure that it is available and applied as a new RPM dependency. In my testing, waiting on the sender side didn't 100% fix things, or sometimes took a very long time due to the backoff algorithm on the cloud-early-config receiver. Some tweaks to that made it more robust, but it is still a game of trying to coordinate timing of two services on either end. If it works though, I'm all for it.

Just to throw another idea out there... If we want to fix this without involving storage, I might suggest switching to the qemu-guest-agent that now exists, with a socket and listening client already in the system vm. This would be far more robust, I think, than our scripting reading unix sockets without any sort of protocol or buffer control considerations, and would likely be more robust to changes in qemu as the guest agent is the primary target for the feature.

We can directly write our /var/cache/cloud/cmdline from the host like so (I'm using virsh but we could perhaps communicate with the guest agent socket directly or via socat):

    virsh qemu-agent-command 19 '{"execute":"guest-file-open", "arguments":{"path":"/tmp/testfile","mode":"w+"}}'
    {"return":1001}

    virsh qemu-agent-command 19 '{"execute":"guest-file-write", "arguments":{"handle":1001,"buf-b64":"Zm9vIHdhcyBoZXJlCg=="}}'
    {"return":{"count":13,"eof":false}}

    virsh qemu-agent-command 19 '{"execute":"guest-file-close", "arguments":{"handle":1001}}'
    {"return":{}}

    root@r-54850-VM:~# cat /tmp/testfile
    foo was here

We are also able to detect via libvirt that the qemu guest agent is up and ready. You can see it in the XML when you list a VM.

We do need to keep other hypervisors in mind. This is just an option for a fix that doesn't involve a larger redesign.
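
As a rough sketch of how the guest-agent calls above could be generated from a script rather than typed by hand, something like the following might work. I have not run this against a real system VM. The function names and the `virsh_argv` helper are made up for illustration; only the guest-file-open / guest-file-write / guest-file-close commands and their arguments come from the virsh example above.

```python
import base64
import json


def guest_file_open(path, mode="w+"):
    """Build the guest-file-open command; the agent replies with a handle."""
    return json.dumps(
        {"execute": "guest-file-open", "arguments": {"path": path, "mode": mode}}
    )


def guest_file_write(handle, data):
    """Build the guest-file-write command; the payload must be base64 text."""
    encoded = base64.b64encode(data).decode("ascii")
    return json.dumps(
        {"execute": "guest-file-write",
         "arguments": {"handle": handle, "buf-b64": encoded}}
    )


def guest_file_close(handle):
    """Build the guest-file-close command for a previously opened handle."""
    return json.dumps(
        {"execute": "guest-file-close", "arguments": {"handle": handle}}
    )


def virsh_argv(domain, command_json):
    """Hypothetical helper: argv list for `virsh qemu-agent-command`."""
    return ["virsh", "qemu-agent-command", str(domain), command_json]


if __name__ == "__main__":
    # Domain id 19 and handle 1001 are just the values from the example above;
    # in practice the handle has to be read from the guest-file-open reply.
    for cmd in (
        guest_file_open("/tmp/testfile"),
        guest_file_write(1001, b"foo was here\n"),
        guest_file_close(1001),
    ):
        print(virsh_argv(19, cmd))
```

Each argv list could then be handed to subprocess.run, with the handle parsed out of the JSON reply to the open call before writing. Because every call gets an explicit reply from the agent, the host side knows whether the write landed instead of guessing at timing.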

On Fri, Apr 12, 2019 at 10:21 AM Rohit Yadav <rohit.ya...@shapeblue.com> wrote:

> Hi Simon,
>
> I'm exploring a solution for the same, I've found that the python based
> patching script fails to wait for the message to be written on the unix
> socket before the socket is closed. I reckon this could be related to
> serial port device handling related changes in qemu-ev 2.12, as the same
> mechanism used to work in past versions.
>
> I'm exploring/testing a solution where I replace the python based patching
> script with a bash one. Can you test the following in your environment
> (ensure socat is installed), just backup and replace the patchviasocket.py
> file with this:
>
> https://gist.github.com/rhtyd/aab23357fef2d8a530c0e83ec8be10c5
>
> The short term solution would be one of the ways to ensure patching works
> without much change in the scripts or systemvmtemplate. However, longer
> term we need to explore and standardize patching mechanism across all
> hypervisors, for example by using a small payload via a config drive iso.
>
> Regards,
>
> Rohit Yadav
>
> Software Architect, ShapeBlue
>
> https://www.shapeblue.com
>
> ________________________________
> From: Simon Weller <swel...@ena.com.INVALID>
> Sent: Friday, April 12, 2019 8:29:04 PM
> To: dev; users
> Subject: Latest Qemu KVM EV appears to be broken with ACS
>
> All,
>
> After troubleshooting a strange issue with a new lab environment
> yesterday, it appears that the patchviasocket functionality we rely on for
> key and ip injection into our router/SSVM/CPVM images is broken with
> qemu-kvm-ev-2.12.0-18.el7 (January 2019 release). This was tested on Centos
> 7.6.
> No data is injected and this was confirmed using socat on /dev/vport0p1.
> qemu-kvm-ev-2.10.0-21.el7_5.7.1 works, so hopefully this will save someone
> some pain and suffering trying to figure out why the deployed seems broken.
>
> We're going to dig in and see if we can figure out the patches responsible
> for it breaking.
>
> -Si
>
> rohit.ya...@shapeblue.com
> www.shapeblue.com
> Amadeus House, Floral Street, London WC2E 9DP UK
> @shapeblue