On Fri, 15 May 2015 09:13:46 +0200 "Michael S. Tsirkin" <m...@redhat.com> wrote:
> On Fri, May 15, 2015 at 09:08:07AM +0200, Christian Borntraeger wrote: > > Am 14.05.2015 um 19:00 schrieb Dr. David Alan Gilbert: > > > * Christian Borntraeger (borntrae...@de.ibm.com) wrote: > > >> Am 14.05.2015 um 11:36 schrieb Michael S. Tsirkin: > > >>> On Thu, May 14, 2015 at 11:22:13AM +0200, Christian Borntraeger wrote: > > >>>> Am 13.05.2015 um 23:47 schrieb Michael S. Tsirkin: > > >>>>> On Wed, May 13, 2015 at 08:57:00PM +0200, Christian Borntraeger wrote: > > >>>>>> Am 13.05.2015 um 18:14 schrieb Michael S. Tsirkin: > > >>>>>>>> - AFAICS, there's no easy way to add transport-specific > > >>>>>>>> subsections - > > >>>>>>>> and simply adding config_vector in ccw would break compatibility > > >>>>>>> > > >>>>>>> subsections break compatibility too. The only way around that is > > >>>>>>> to set > > >>>>>>> a flag to skip migrating config_vector for old machine types. > > >>>>>> > > >>>>>> My main concern is about undetected compatibility issues. A > > >>>>>> subsection will > > >>>>>> tell the user that something went wrong. What happens if we just add > > >>>>>> a new > > >>>>>> qemu_put_byte in the stream. Will the savevm core always detect that > > >>>>>> we have > > >>>>>> too many or not enough bytes? If yes, adding new stuff in the stream > > >>>>>> will > > >>>>>> always be detected in some way as error we can go with just adding > > >>>>>> qemu_put_be16/qemu_get_be16 in > > >>>>>> virtio_ccw_save_config/virtio_ccw_load_config. > > >>>>>> Old/new QEMUs will then not be compatible - but thats probably ok as > > >>>>>> long as it > > >>>>>> errors out. > > >>>>>> > > >>>>>> My understanding was that we do not have a guarentee that this will > > >>>>>> be > > >>>>>> detected all the time and having random junk in some variables is a > > >>>>>> debugging > > >>>>>> nightmare. Is that correct? > > >>>>>> > > >>>>>> > > >>>>>> Christian > > >>>>> > > >>>>> It's not too bad - normally there's a bunch of strings that > > >>>>> helps you find out what's going on. > > >>>>> But if you really care about debuggability of migration streams, help > > >>>>> move > > >>>>> forward dgilbert's RFC that switched to a self-delimiting format. > > >>>>> Just piling up random hacks in virtio seems like a wrong approach. > > >>>>> > > >>>> > > >>>> Thats not my question. PLEASE try to understand my question. > > >>>> I want a hard stop if migration changes in incompatible ways. > > >>>> If adding a qemu_put_byte in virtio_ccw gets detected we can just fix > > >>>> virtio_ccw AS YOU SUGGESTED. I just want to know if I can rely on that > > >>>> or not. > > >>>> > > >>>> Christian > > >>> > > >>> I answered exactly this question but let me try to spell the answer > > >>> out a bit more. > > >>> > > >>> There are three answers: > > >>> 1. Yes, it's sure to get detected because everything gets shifted > > >>> and then you get an unexpected string instead of next device name. > > >> > > >> Ok got it. So simply adding a qemu_put/get_byte will always fail the > > >> migration and we > > >> can just fixup virtio-ccw.c at the cost of being not migrateable between > > >> versions before/after that change. > > > > > > Gahhh! No! Adding an extra byte into the stream causes random horrible > > > failures > > > that get very very confusing. Yes, it will probably fail, but how it will > > > fail and what error you get is just guess work. > > > (And note, it's strictly a 'probably fail' - if you happen to land with > > > a zero byte where you expect the start of the next section the migration > > > code will think it's the end of the migration stream and blissfully start > > > the CPUs). > > > > As Conny is away today, I will drive the dicussion a bit further :-) > > So we really want a feature that detects this change and prevents migration. > > I think its totally fine to not be able to migrate between todays QEMUs and > > a fixed version for s390 as there no supported environment today I am aware > > of. What would be the preferred way to go? > > a: Connies approach of a subsection that is only migrated if necessary > > (config vector != 0xffff) Would the subsection approach be more acceptable if I default needed to false and have only virtio-ccw override it in a callback? > > b: change virtio-ccw.c with put/get_be16 and make a new version of the > > s390-ccw machine? The old version will set a property to not migrate the > > config > > vector. (like Michaels 2nd suggestion) I'm still not sure mucking with individual values is a good idea. > > c: ? > > c. Set old machine as non-migrateable. Overkill? > > I prefer b or c. I still prefer a :) - mainly because I think introducing subsections makes it more obvious what is happening. It's a pity I can't introduce a subsection in ccw alone, but have to go to the core for that.