[PATCH] virt-host-validate: Drop extra "PASS"

2024-07-11 Thread Michal Privoznik
If virt-host-validate is run on a SEV-SNP capable machine, an
extra "PASS" is printed out. This is because
virHostValidateAMDSev() prints "PASS" and then returns 1
(indicating success), which in turn makes the caller
(virHostValidateSecureGuests()) print "PASS" again. Just drop the
extra printing in the caller and let virHostValidateAMDSev() do
all the printing.

Fixes: 1a8f646f291775d2423ce4e4df62ad69f06ab827
Resolves: https://issues.redhat.com/browse/RHEL-46868
Signed-off-by: Michal Privoznik 
---
 tools/virt-host-validate-common.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/tools/virt-host-validate-common.c b/tools/virt-host-validate-common.c
index a29a5b6d5f..86db4815c3 100644
--- a/tools/virt-host-validate-common.c
+++ b/tools/virt-host-validate-common.c
@@ -488,11 +488,7 @@ int virHostValidateSecureGuests(const char *hvname,
             return VIR_VALIDATE_FAILURE(level);
         }
     } else if (hasAMDSev) {
-        int rc = virHostValidateAMDSev(hvname, level);
-
-        if (rc > 0)
-            virValidatePass();
-        return rc;
+        return  virHostValidateAMDSev(hvname, level);
     }
 
     virValidateFail(level,
-- 
2.44.2


[PATCH] virt-host-validate: Rework calling of driver validation

2024-07-11 Thread Michal Privoznik
It all started with me looking at the --help output, which also
printed "bhyve" as a supported hypervisor type. Well, it's not
available on my Linux machine. To resolve this, I'm just creating
a static array of { "$driver", callback() } pairs and iterating
over it. The array is then initialized at compile time with only
the supported drivers.
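
With the array in place, --help only lists the drivers that were
actually compiled in; e.g. a build configured with just QEMU and LXC
would print something like:

  syntax: virt-host-validate [OPTIONS] [HVTYPE]

   Hypervisor types:

     - qemu
     - lxc

   Options:
     -h, --help     Display command line help
     -v, --version  Display command version
     -q, --quiet    Don't display progress information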

Signed-off-by: Michal Privoznik 
---
 tools/virt-host-validate.c | 80 +-
 1 file changed, 45 insertions(+), 35 deletions(-)

diff --git a/tools/virt-host-validate.c b/tools/virt-host-validate.c
index 426648a5d3..365b8acd92 100644
--- a/tools/virt-host-validate.c
+++ b/tools/virt-host-validate.c
@@ -29,6 +29,7 @@
 #include "internal.h"
 #include "virerror.h"
 #include "virgettext.h"
+#include "virglibutil.h"
 
 #include "virt-host-validate-common.h"
 #if WITH_QEMU
@@ -44,26 +45,58 @@
 # include "virt-host-validate-ch.h"
 #endif
 
+typedef struct _virValidateCallbacks virValidateCallbacks;
+struct _virValidateCallbacks {
+    const char *name;
+    int (*callback)(void);
+};
+
+static virValidateCallbacks validateCallbacks[] = {
+#if WITH_QEMU
+    { "qemu", virHostValidateQEMU },
+#endif
+#if WITH_LXC
+    { "lxc", virHostValidateLXC },
+#endif
+#if WITH_BHYVE
+    { "bhyve", virHostValidateBhyve },
+#endif
+#if WITH_CH
+    { "ch", virHostValidateCh },
+#endif
+};
+
 static void
 show_help(FILE *out, const char *argv0)
 {
+    g_autofree char *hvs = NULL;
+    char *hvs_list[G_N_ELEMENTS(validateCallbacks) + 1] = { };
+    size_t i;
+
+    for (i = 0; i < G_N_ELEMENTS(validateCallbacks); i++) {
+        hvs_list[i] = g_strdup_printf("   - %1$s", validateCallbacks[i].name);
+    }
+
+    hvs = g_strjoinv("\n", hvs_list);
+
+    for (i = 0; i < G_N_ELEMENTS(validateCallbacks); i++) {
+        g_free(hvs_list[i]);
+    }
+
     fprintf(out,
             _("\n"
               "syntax: %1$s [OPTIONS] [HVTYPE]\n"
               "\n"
               " Hypervisor types:\n"
               "\n"
-              "   - qemu\n"
-              "   - lxc\n"
-              "   - bhyve\n"
-              "   - ch\n"
+              "%2$s\n"
               "\n"
               " Options:\n"
               "   -h, --help     Display command line help\n"
               "   -v, --version  Display command version\n"
               "   -q, --quiet    Don't display progress information\n"
               "\n"),
-            argv0);
+            argv0, hvs);
 }
 
 static void
@@ -87,6 +120,7 @@ main(int argc, char **argv)
     int ret = EXIT_SUCCESS;
     bool quiet = false;
     bool usedHvname = false;
+    size_t i;
 
     if (virGettextInitialize() < 0 ||
         virErrorInitialize() < 0) {
@@ -126,37 +160,13 @@
 
     virValidateSetQuiet(quiet);
 
-#if WITH_QEMU
-    if (!hvname || STREQ(hvname, "qemu")) {
-        usedHvname = true;
-        if (virHostValidateQEMU() < 0)
-            ret = EXIT_FAILURE;
+    for (i = 0; i < G_N_ELEMENTS(validateCallbacks); i++) {
+        if (!hvname || STREQ(hvname, validateCallbacks[i].name)) {
+            usedHvname = true;
+            if (validateCallbacks[i].callback() < 0)
+                ret = EXIT_FAILURE;
+        }
     }
-#endif
-
-#if WITH_LXC
-    if (!hvname || STREQ(hvname, "lxc")) {
-        usedHvname = true;
-        if (virHostValidateLXC() < 0)
-            ret = EXIT_FAILURE;
-    }
-#endif
-
-#if WITH_BHYVE
-    if (!hvname || STREQ(hvname, "bhyve")) {
-        usedHvname = true;
-        if (virHostValidateBhyve() < 0)
-            ret = EXIT_FAILURE;
-    }
-#endif
-
-#if WITH_CH
-    if (!hvname || STREQ(hvname, "ch")) {
-        usedHvname = true;
-        if (virHostValidateCh() < 0)
-            ret = EXIT_FAILURE;
-    }
-#endif
 
     if (hvname && !usedHvname) {
         fprintf(stderr, _("%1$s: unsupported hypervisor name %2$s\n"),
-- 
2.44.2


Re: [PATCH] virt-host-validate: Drop extra "PASS"

2024-07-11 Thread Ján Tomko

On a Thursday in 2024, Michal Privoznik wrote:

If virt-host-validate is run on a SEV-SNP capable machine, an
extra "PASS" is printed out. This is because
virHostValidateAMDSev() prints "PASS" and then returns 1
(indicating success), which in turn makes the caller
(virHostValidateSecureGuests()) print "PASS" again. Just drop the
extra printing in the caller and let virHostValidateAMDSev() do
all the printing.

Fixes: 1a8f646f291775d2423ce4e4df62ad69f06ab827
Resolves: https://issues.redhat.com/browse/RHEL-46868
Signed-off-by: Michal Privoznik 
---
tools/virt-host-validate-common.c | 6 +-
1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/tools/virt-host-validate-common.c b/tools/virt-host-validate-common.c
index a29a5b6d5f..86db4815c3 100644
--- a/tools/virt-host-validate-common.c
+++ b/tools/virt-host-validate-common.c
@@ -488,11 +488,7 @@ int virHostValidateSecureGuests(const char *hvname,
             return VIR_VALIDATE_FAILURE(level);
         }
     } else if (hasAMDSev) {
-        int rc = virHostValidateAMDSev(hvname, level);
-
-        if (rc > 0)
-            virValidatePass();
-        return rc;
+        return  virHostValidateAMDSev(hvname, level);


Double space^^


     }

     virValidateFail(level,


Reviewed-by: Ján Tomko 

Jano




Re: [PATCH] virt-host-validate: Rework calling of driver validation

2024-07-11 Thread Ján Tomko

On a Thursday in 2024, Michal Privoznik wrote:

It all started with me looking at the --help output, which also
printed "bhyve" as a supported hypervisor type.


It did not say it is a supported hypervisor type; it merely listed it
as a supported option. I don't think we need this patch.

It works as expected on my Linux machine:

$ virt-host-validate bhyve
virt-host-validate: unsupported hypervisor name bhyve

Jano


Well, it's not
available on my Linux machine. To resolve this, I'm just creating
a static array of { "$driver", callback() } pairs and iterating
over it. The array is then initialized at compile time with only
the supported drivers.

Signed-off-by: Michal Privoznik 
---
tools/virt-host-validate.c | 80 +-
1 file changed, 45 insertions(+), 35 deletions(-)

diff --git a/tools/virt-host-validate.c b/tools/virt-host-validate.c
index 426648a5d3..365b8acd92 100644
--- a/tools/virt-host-validate.c
+++ b/tools/virt-host-validate.c
@@ -29,6 +29,7 @@
 #include "internal.h"
 #include "virerror.h"
 #include "virgettext.h"
+#include "virglibutil.h"
 
 #include "virt-host-validate-common.h"
 #if WITH_QEMU
@@ -44,26 +45,58 @@
 # include "virt-host-validate-ch.h"
 #endif
 
+typedef struct _virValidateCallbacks virValidateCallbacks;
+struct _virValidateCallbacks {
+    const char *name;
+    int (*callback)(void);
+};
+
+static virValidateCallbacks validateCallbacks[] = {
+#if WITH_QEMU
+    { "qemu", virHostValidateQEMU },
+#endif
+#if WITH_LXC
+    { "lxc", virHostValidateLXC },
+#endif
+#if WITH_BHYVE
+    { "bhyve", virHostValidateBhyve },
+#endif
+#if WITH_CH
+    { "ch", virHostValidateCh },
+#endif
+};
+
 static void
 show_help(FILE *out, const char *argv0)
 {
+    g_autofree char *hvs = NULL;
+    char *hvs_list[G_N_ELEMENTS(validateCallbacks) + 1] = { };
+    size_t i;
+
+    for (i = 0; i < G_N_ELEMENTS(validateCallbacks); i++) {
+        hvs_list[i] = g_strdup_printf("   - %1$s", validateCallbacks[i].name);
+    }
+
+    hvs = g_strjoinv("\n", hvs_list);
+
+    for (i = 0; i < G_N_ELEMENTS(validateCallbacks); i++) {
+        g_free(hvs_list[i]);
+    }
+
     fprintf(out,
             _("\n"
               "syntax: %1$s [OPTIONS] [HVTYPE]\n"
               "\n"
               " Hypervisor types:\n"
               "\n"
-              "   - qemu\n"
-              "   - lxc\n"
-              "   - bhyve\n"
-              "   - ch\n"
+              "%2$s\n"
               "\n"
               " Options:\n"
               "   -h, --help     Display command line help\n"
               "   -v, --version  Display command version\n"
               "   -q, --quiet    Don't display progress information\n"
               "\n"),
-            argv0);
+            argv0, hvs);
 }
 
 static void
@@ -87,6 +120,7 @@ main(int argc, char **argv)
     int ret = EXIT_SUCCESS;
     bool quiet = false;
     bool usedHvname = false;
+    size_t i;
 
     if (virGettextInitialize() < 0 ||
         virErrorInitialize() < 0) {
@@ -126,37 +160,13 @@
 
     virValidateSetQuiet(quiet);
 
-#if WITH_QEMU
-    if (!hvname || STREQ(hvname, "qemu")) {
-        usedHvname = true;
-        if (virHostValidateQEMU() < 0)
-            ret = EXIT_FAILURE;
+    for (i = 0; i < G_N_ELEMENTS(validateCallbacks); i++) {
+        if (!hvname || STREQ(hvname, validateCallbacks[i].name)) {
+            usedHvname = true;
+            if (validateCallbacks[i].callback() < 0)
+                ret = EXIT_FAILURE;
+        }
     }
-#endif
-
-#if WITH_LXC
-    if (!hvname || STREQ(hvname, "lxc")) {
-        usedHvname = true;
-        if (virHostValidateLXC() < 0)
-            ret = EXIT_FAILURE;
-    }
-#endif
-
-#if WITH_BHYVE
-    if (!hvname || STREQ(hvname, "bhyve")) {
-        usedHvname = true;
-        if (virHostValidateBhyve() < 0)
-            ret = EXIT_FAILURE;
-    }
-#endif
-
-#if WITH_CH
-    if (!hvname || STREQ(hvname, "ch")) {
-        usedHvname = true;
-        if (virHostValidateCh() < 0)
-            ret = EXIT_FAILURE;
-    }
-#endif
 
     if (hvname && !usedHvname) {
         fprintf(stderr, _("%1$s: unsupported hypervisor name %2$s\n"),
--
2.44.2





[PATCH] qemu: Don't leave beingDestroyed=true on inactive domain

2024-07-11 Thread Jiri Denemark
Recent commit v10.4.0-87-gd9935a5c4f made a reasonable change to only
reset beingDestroyed back to false when vm->def->id is reset, to make
sure other code can detect a domain is (about to become) inactive. It
even added a comment saying any caller of qemuProcessBeginStopJob is
supposed to call qemuProcessStop to clear beingDestroyed. But not every
caller really does so, because they first call qemuProcessBeginStopJob
and then check whether a domain is still running. If it is not, the
qemuProcessStop call is skipped, leaving beingDestroyed=true. In case
of a persistent domain this may block incoming migrations of such a
domain, as the migration code would think the domain died unexpectedly
(even though it's still running).

The qemuProcessBeginStopJob function is a wrapper around
virDomainObjBeginJob, but virDomainObjEndJob was used directly for
cleanup. This patch introduces a new qemuProcessEndStopJob wrapper
around virDomainObjEndJob to properly undo everything
qemuProcessBeginStopJob did.

https://issues.redhat.com/browse/RHEL-43309

Signed-off-by: Jiri Denemark 
---
 src/qemu/qemu_driver.c  |  4 ++--
 src/qemu/qemu_process.c | 20 
 src/qemu/qemu_process.h |  1 +
 3 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index f898d85667..9f3013e231 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -2102,7 +2102,7 @@ qemuDomainDestroyFlags(virDomainPtr dom,
  endjob:
     if (ret == 0)
         qemuDomainRemoveInactive(driver, vm, 0, false);
-    virDomainObjEndJob(vm);
+    qemuProcessEndStopJob(vm);
 
  cleanup:
     virDomainObjEndAPI(&vm);
@@ -3888,7 +3888,7 @@ processMonitorEOFEvent(virQEMUDriver *driver,
 
  endjob:
     qemuDomainRemoveInactive(driver, vm, 0, false);
-    virDomainObjEndJob(vm);
+    qemuProcessEndStopJob(vm);
 }
 
 
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 7cbe521a6e..25dfd04272 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -8421,7 +8421,8 @@ qemuProcessKill(virDomainObj *vm, unsigned int flags)
  * qemuProcessBeginStopJob:
  *
  * Stop all current jobs by killing the domain and start a new one for
- * qemuProcessStop.
+ * qemuProcessStop. The caller has to make sure qemuProcessEndStopJob is
+ * called to properly clean up the job.
  */
 int
 qemuProcessBeginStopJob(virDomainObj *vm,
@@ -8448,8 +8449,9 @@ qemuProcessBeginStopJob(virDomainObj *vm,
         goto error;
 
     /* priv->beingDestroyed is deliberately left set to 'true' here. Caller
-     * is supposed to call qemuProcessStop, which will reset it after
-     * 'vm->def->id' is set to -1 */
+     * is supposed to call qemuProcessStop (which will reset it after
+     * 'vm->def->id' is set to -1) and/or qemuProcessEndStopJob to do proper
+     * cleanup. */
     return 0;
 
  error:
@@ -8458,6 +8460,16 @@ qemuProcessBeginStopJob(virDomainObj *vm,
 }
 
 
+void
+qemuProcessEndStopJob(virDomainObj *vm)
+{
+    if (!virDomainObjIsActive(vm))
+        QEMU_DOMAIN_PRIVATE(vm)->beingDestroyed = false;
+
+    virDomainObjEndJob(vm);
+}
+
+
 void qemuProcessStop(virQEMUDriver *driver,
                      virDomainObj *vm,
                      virDomainShutoffReason reason,
@@ -8800,7 +8812,7 @@ qemuProcessAutoDestroy(virDomainObj *dom,
 
     qemuDomainRemoveInactive(driver, dom, 0, false);
 
-    virDomainObjEndJob(dom);
+    qemuProcessEndStopJob(dom);
 
     virObjectEventStateQueue(driver->domainEventState, event);
 }
diff --git a/src/qemu/qemu_process.h b/src/qemu/qemu_process.h
index c1ea949215..cb67bfcd2d 100644
--- a/src/qemu/qemu_process.h
+++ b/src/qemu/qemu_process.h
@@ -169,6 +169,7 @@ typedef enum {
 int qemuProcessBeginStopJob(virDomainObj *vm,
                             virDomainJob job,
                             bool forceKill);
+void qemuProcessEndStopJob(virDomainObj *vm);
 void qemuProcessStop(virQEMUDriver *driver,
                      virDomainObj *vm,
                      virDomainShutoffReason reason,
-- 
2.45.2


Re: [PATCH] qemu: Don't leave beingDestroyed=true on inactive domain

2024-07-11 Thread Peter Krempa
On Thu, Jul 11, 2024 at 14:55:17 +0200, Jiri Denemark wrote:
> Recent commit v10.4.0-87-gd9935a5c4f made a reasonable change to only
> reset beingDestroyed back to false when vm->def->id is reset, to make
> sure other code can detect a domain is (about to become) inactive. It
> even added a comment saying any caller of qemuProcessBeginStopJob is
> supposed to call qemuProcessStop to clear beingDestroyed. But not every
> caller really does so, because they first call qemuProcessBeginStopJob
> and then check whether a domain is still running. If it is not, the
> qemuProcessStop call is skipped, leaving beingDestroyed=true. In case
> of a persistent domain this may block incoming migrations of such a
> domain, as the migration code would think the domain died unexpectedly
> (even though it's still running).
> 
> The qemuProcessBeginStopJob function is a wrapper around
> virDomainObjBeginJob, but virDomainObjEndJob was used directly for
> cleanup. This patch introduces a new qemuProcessEndStopJob wrapper
> around virDomainObjEndJob to properly undo everything
> qemuProcessBeginStopJob did.
> 
> https://issues.redhat.com/browse/RHEL-43309
> 
> Signed-off-by: Jiri Denemark 
> ---
>  src/qemu/qemu_driver.c  |  4 ++--
>  src/qemu/qemu_process.c | 20 
>  src/qemu/qemu_process.h |  1 +
>  3 files changed, 19 insertions(+), 6 deletions(-)

For consistency you should also fix qemuProcessAutoDestroy, which uses
qemuProcessBeginStopJob even when it doesn't jump out before calling
qemuProcessStop.

Reviewed-by: Peter Krempa 


Re: [PATCH] security: AppArmor allow write when os loader readonly=no

2024-07-11 Thread Andrea Bolognani
On Tue, Jul 09, 2024 at 08:41:17AM GMT, mirlos--- via Devel wrote:
> My reply by email has not arrived by now, hence I'll post it via
> the archive site. Sorry for the potential double post.

No double post as far as I can tell :)

> Older bootloaders were not split into separate _CODE.fd and _VARS.fd,
> i.e. there was no separate NVRAM to create from the latter file. The
> guest could write to the single bootloader, which then must not be shared.
>
> You mark such bootloaders readonly=no and make a copy of the pflash,
> e.g. next to the VM's disk files, as you did in your TEST ONLY run.
>
> It is a mode of operation supported by the formatdomain documentation
> on the loader element and intended to work as described. This patch
> makes such combination of parameters actually succeed on Ubuntu,
> which I find should be useful to the project.
>
> The VMs we use this for do not actually write anything to the loader.
> This meant that we never noticed the bug, which was present in focal and
> configured qemu to open the loader read-only anyway. But it failed on
> AppArmor in noble, since the bug is now fixed in newer libvirt.

So the behavior changed between libvirt 6.0.0 in Ubuntu 20.04 and
libvirt 10.0.0 in Ubuntu 24.04. That's not surprising, since I've
done a lot of work to fix and improve firmware handling around the
9.2.0 timeframe.

You seem to identify the change of behavior as a bug fix, which
matches my understanding too.

> As a workaround, we've in the meantime switched to marking the loader
> stateless and read-only, which is now allowed in libvirt and also works
> without requiring a separate NVRAM. Of course, this only works because
> the VM does not make any writes and would fail in case it needed to.

Right, that ought to do the trick.
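
For reference, the two configurations discussed in this thread map
roughly to domain XML along these lines (a sketch; the paths and image
names are illustrative, not taken from the patch):

  <!-- Writable single-file loader: the guest may modify it, so each
       VM needs its own private copy of the pflash image. -->
  <os>
    <loader readonly='no' type='pflash'>/var/lib/libvirt/images/guest-loader.fd</loader>
  </os>

  <!-- The workaround: a stateless, read-only loader with no separate
       NVRAM, viable only while the guest never writes to the loader. -->
  <os>
    <loader readonly='yes' stateless='yes' type='pflash'>/usr/share/ovmf/OVMF.fd</loader>
  </os>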

As far as I'm concerned, I'm satisfied with the explanation you've
provided so the patch gets my

  Reviewed-by: Andrea Bolognani 

Christian, any objection to pushing it?

-- 
Andrea Bolognani / Red Hat / Virtualization


Re: [PATCH v3 06/17] hw/sd/sdcard: Do not store vendor data on block drive (CMD56)

2024-07-11 Thread Fabiano Rosas
Peter Xu  writes:

> On Wed, Jul 10, 2024 at 06:38:26PM -0300, Fabiano Rosas wrote:
>> Peter Xu  writes:
>> 
>> > On Wed, Jul 10, 2024 at 04:48:23PM -0300, Fabiano Rosas wrote:
>> >> Peter Xu  writes:
>> >> 
>> >> > On Wed, Jul 10, 2024 at 01:21:51PM -0300, Fabiano Rosas wrote:
>> >> >> It's not about trust; we simply don't support migrations other than
>> >> >> n->n+1 and (maybe) n->n-1. So QEMU from 2016 is certainly not included.
>> >> >
>> >> > Where does it come from?  I thought we suppport that..
>> >> 
>> >> I'm taking that from:
>> >> 
>> >> docs/devel/migration/main.rst:
>> >>   "In general QEMU tries to maintain forward migration compatibility
>> >>   (i.e. migrating from QEMU n->n+1) and there are users who benefit from
>> >>   backward compatibility as well."
>> >> 
>> >> But of course it doesn't say whether that comes with a transitive rule
>> >> allowing n->n+2 migrations.
>> >
>> > I'd say that "i.e." implies n->n+1 is not the only forward migration we
>> > would support.
>> >
>> > I _think_ we should support all forward migration as long as the machine
>> > type matches.
>> >
>> >> 
>> >> >
>> >> > The same question would be: are we requesting an OpenStack cluster to
>> >> > always upgrade QEMU with +1 versions, otherwise migration will fail?
>> >> 
>> >> Will an OpenStack cluster be using upstream QEMU? If not, then that's a
>> >
>> > It's an example to show what I meant! :) Nothing else. Definitely not
>> > saying that everyone should use an upstream released QEMU (but in reality,
>> > it's not a problem, I think, and I do feel like people use them, perhaps
>> > more with the stable releases).
>> >
>> >> question for the distro. In a very practical sense, we're not requesting
>> anything. We barely test n->n+1/n->n-1; even if we had a strong support
>> statement, I wouldn't be confident saying migration from QEMU 2.7 -> QEMU
>> >> 9.1 should succeed.
>> >
>> > No matter what we test in CI, I don't think we should break that for >1
>> > versions..  I hope 2.7->9.1 keeps working, otherwise I think it's legal to
>> > file a bug by anyone.
>> >
>> > For example, I randomly fetched a bug report:
>> >
>> > https://gitlab.com/qemu-project/qemu/-/issues/1937
>> >
>> > QEMU version:6.2 and 7.2.5
>> >
>> > And I believe that's the common case even for upstream.  If we don't do
>> > that right for upstream, it can be impossible tasks for downstream and for
>> > all of us to maintain.
>> 
>> But do we do that right currently? I have no idea. Have we ever done
>> it? And we're here discussing a hypothetical 2.7->9.1 ...
>> 
>> So we cannot reuse the UNUSED field because QEMU from 2016 might send
>> their data and QEMU from 2024 would interpret it wrong.
>> 
>> How do we proceed? Add a subsection. And make the code survive when
>> receiving 0.
>> 
>> @Peter is that it? What about backwards-compat? We'll need a property as
>> well it seems.
>
> Compat property is definitely one way to go, but I think it's you who more
> or less persuaded me that reusing it seems possible! At least I can't yet
> think of anything bad if it's ancient unused buffers.

Since we're allowing any old QEMU version to migrate to the most recent
one, we need to think of the data that was there before the introduction
of the UNUSED field. If that QEMU version is used, then it's not going
to be zeroes on the stream, but whatever data was there before. The new
QEMU will be expecting the vendor_data introduced in this patch.
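
For concreteness, a subsection along the lines discussed could look
roughly like this; the identifiers vendor_data_enabled and
"x-vendor-data" below are invented for the sketch, not taken from the
actual hw/sd patch:

static bool sd_vendor_data_needed(void *opaque)
{
    SDState *sd = opaque;

    /* Backed by a machine compat property: old machine types would set
     * it to false so the subsection is not sent when migrating to an
     * older QEMU. */
    return sd->vendor_data_enabled;
}

static const VMStateDescription vmstate_sd_vendor_data = {
    .name = "sd-card/vendor-data",
    .version_id = 1,
    .minimum_version_id = 1,
    .needed = sd_vendor_data_needed,
    .fields = (const VMStateField[]) {
        VMSTATE_UINT8_ARRAY(vendor_data, SDState, 512),
        VMSTATE_END_OF_LIST()
    },
};

The old 512-byte slot in the parent VMStateDescription would stay as
VMSTATE_UNUSED(512), so whatever an older QEMU sends there keeps being
read and discarded instead of being misinterpreted, and the device has
to cope with the subsection being absent (vendor_data left all zeroes).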

> And that's why I was asking about a sane way to describe the "magic
> year".. And I was very serious when I said "6 years" to follow the
> deprecation of machine types, because it'll be something we can follow
> to say when an unused buffer can be obsolete and it could make some
> sense: if we will start to deprecate machine types, then it may not
> make sense to keep any UNUSED compatible forever, after all.
>

Is there an easy way to look at a field and tell in which machine type's
timeframe it was introduced? If the machine type of that era has been
removed, then the field is free to go as well. I'd prefer if we had a
hard link instead of just counting years. Maybe we should do that
mapping at the machine deprecation time? As in, "look at the unused
fields introduced in that timeframe and mark them free".

> And we need that ruler to be as accurate as "always 6 years to follow
> machine type deprecation procedure", in case someone else tomorrow asks us
> something that was only UNUSED since 2018.  We need a rule of thumb if we
> want to reuse it, and if all of you agree we can start with this one,
> perhaps with a comment above the field (before we think all through and
> make it a rule on deprecating UNUSED)?


Re: [PATCH v3 06/17] hw/sd/sdcard: Do not store vendor data on block drive (CMD56)

2024-07-11 Thread Peter Xu
On Thu, Jul 11, 2024 at 10:34:12AM -0300, Fabiano Rosas wrote:
> Is there an easy way to look at a field and tell in which machine type's
> timeframe it was introduced?

I am not aware of any.

> If the machine type of that era has been removed, then the field is free
> to go as well. I'd prefer if we had a hard link instead of just counting
> years. Maybe we should do that mapping at the machine deprecation time?
> As in, "look at the unused fields introduced in that timeframe and mark
> them free".

We can do that, but it depends on how easy it would be. It could be
overkill if it's non-trivial.  When it becomes complicated, I'd rather
make machine compat properties easier to use so we always stick with
them.  Currently they're not as easy to use.
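
Wiring up such a compat property looks roughly like this (a sketch with
invented device and property names, not real entries):

/* The device defaults to the new behaviour... */
static Property sd_properties[] = {
    DEFINE_PROP_BOOL("x-vendor-data", SDState, vendor_data_enabled, true),
    DEFINE_PROP_END_OF_LIST(),
};

/* ...and an entry in the previous release's compat array flips it back
 * for older machine types, preserving the old wire format. */
GlobalProperty hw_compat_9_0[] = {
    { "sd-card", "x-vendor-data", "false" },
};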

Maybe we shouldn't make it a common rule to let people reuse the UNUSED
fields, even if in this case it's probably fine?

E.g. I don't think it's a huge deal to keep all UNUSED fields forever -
sending 512B zeros for only one specific device isn't an issue even if kept
forever.

If "over 6 years" would be okay and simple enough, then maybe we can stick
with that (and only if people would like to reuse a field and ask; that's
after all not required..).  If more than that I doubt whether we should
spend time working on covering all the fields.

-- 
Peter Xu


Re: [PATCH v3 06/17] hw/sd/sdcard: Do not store vendor data on block drive (CMD56)

2024-07-11 Thread Fabiano Rosas
Peter Xu  writes:

> On Thu, Jul 11, 2024 at 10:34:12AM -0300, Fabiano Rosas wrote:
>> Is there an easy way to look at a field and tell in which machine type's
>> timeframe it was introduced?
>
> I am not aware of any.
>
>> If the machine type of that era has been removed, then the field is free
>> to go as well. I'd prefer if we had a hard link instead of just counting
>> years. Maybe we should do that mapping at the machine deprecation time?
>> As in, "look at the unused fields introduced in that timeframe and mark
>> them free".
>
> We can do that, but it depends on how easy it would be. It could be
> overkill if it's non-trivial.  When it becomes complicated, I'd rather
> make machine compat properties easier to use so we always stick with
> them.  Currently they're not as easy to use.
>
> Maybe we shouldn't make it a common rule to let people reuse the UNUSED
> fields, even if in this case it's probably fine?
>
> E.g. I don't think it's a huge deal to keep all UNUSED fields forever -
> sending 512B zeros for only one specific device isn't an issue even if kept
> forever.
>
> If "over 6 years" would be okay and simple enough, then maybe we can stick
> with that (and only if people would like to reuse a field and ask; that's
> after all not required..).  If more than that I doubt whether we should
> spend time working on covering all the fields.

I'm fine with a simple rule.

But of course, that means we cannot claim to support all kinds of
forward migrations anymore. Only those within the 6-year period.


Re: [PATCH v3 06/17] hw/sd/sdcard: Do not store vendor data on block drive (CMD56)

2024-07-11 Thread Peter Xu
On Thu, Jul 11, 2024 at 11:44:08AM -0300, Fabiano Rosas wrote:
> But of course, that means we cannot claim to support all kinds of
> forward migrations anymore. Only those within the 6-year period.

That "6 years" comes from machine type deprecation period, and migration
compatibility is mostly only attached to machine types, and we only ever
allowed migration with the same machine type.

It means, >6 years migration will never work anyway as soon as we start to
deprecate machine types (irrelevant of any reuse of UNUSED), because the >6
years machine types will simply be gone.. See configuration_post_load(),
where it'll throw an error upfront when machine type mismatched.
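
Paraphrased (a sketch, not the exact migration/savevm.c code), that
check has this shape:

static int configuration_post_load(void *opaque, int version_id)
{
    SaveState *state = opaque;
    const char *current_name = MACHINE_GET_CLASS(current_machine)->name;

    /* The source's machine type name travels in the "configuration"
     * section of the stream and must match the destination exactly. */
    if (strncmp(state->name, current_name, state->len) != 0) {
        error_report("Machine type received is '%.*s' and local is '%s'",
                     (int) state->len, state->name, current_name);
        return -EINVAL;
    }

    return 0;
}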

-- 
Peter Xu


Re: [PATCH v3 06/17] hw/sd/sdcard: Do not store vendor data on block drive (CMD56)

2024-07-11 Thread Fabiano Rosas
Peter Xu  writes:

> On Thu, Jul 11, 2024 at 11:44:08AM -0300, Fabiano Rosas wrote:
>> But of course, that means we cannot claim to support all kinds of
>> forward migrations anymore. Only those within the 6-year period.
>
> That "6 years" comes from machine type deprecation period, and migration
> compatibility is mostly only attached to machine types, and we only ever
> allowed migration with the same machine type.
>
> It means, >6 years migration will never work anyway as soon as we start to
> deprecate machine types (irrelevant of any reuse of UNUSED), because the >6
> years machine types will simply be gone.. See configuration_post_load(),
> where it'll throw an error upfront when machine type mismatched.

Yes, duh! What am I talking about...