[warning: long email] On Wed, Jul 3, 2024 at 2:48 PM Marc-André Lureau <marcandre.lur...@gmail.com> wrote: > > Hi > > (adding Sebastian, one of the glib-rs developers in CC) > > On Mon, Jul 1, 2024 at 7:02 PM Paolo Bonzini <pbonz...@redhat.com> wrote: >> >> The qemu::util::foreign module provides: >> >> - A trait for structs that can be converted to a C ("foreign") representation >> - A trait for structs that can be built from a C ("foreign") representation >> - A wrapper for a pointer that automatically frees the contained data. > > You worry about technical debt, and I do too. Here you introduce quite > different traits than what glib-rs offers. > We already touched this subject 2y ago, my opinion didn't change much > (https://patchew.org/QEMU/20210907121943.3498701-1-marcandre.lur...@redhat.com/20210907121943.3498701-13-marcandre.lur...@redhat.com/)
Hi Marc-André, first of all thanks for reviewing and, probably, sorry for not going more into detail as to why I reinvented this particular wheel. Like two years ago, I find the full/none nomenclature very confusing and there is no trace of them in QEMU, but I could get over that easily. But also, I think the two have different focus and design. All I wanted was a place to put common code to convert Rust objects (mostly CStr/String/str and Error) to and from C representations (char * and QEMU's Error*), to avoid rewriting it over and over. So I wanted the traits to be as simple as possible, and I didn't like that glib puts to_glib_none and to_glib_full into the same trait, though I understand why it does so. I also didn't like that to_glib_full() is an express ticket to a memory leak. :) OwnedPointer<> and free_foreign solve all the qualms I had with glib-rs: - OwnedPointer<> takes care of freeing at the cost of extra verbosity for ".as_ptr()". - because clone_to_foreign() does not leak, I don't need to think about to_glib_none() from the get go. This has a ripple effect on other parts of the design. For example, because glib has to_glib_none(), its basic design is to never allocate malloc-ed memory unless ownership of the pointer is passed to C. glib-rs's to_glib_none can and will "clone to Rust-managed memory" if needed. Instead, because I have free_foreign(), I can malloc freely. In the future we could add something like Stash/to_glib_none(), but it would be very different and with performance properties enforced by the type system. It would "create a pointer into *existing* Rust-managed memory" cheaply but, every time a cheap borrow is not possible, users will have to call clone_to_foreign() explicitly. In other words there will be a difference between OwnedPointer<> as the result of an expensive operation, and [the equivalent of glib-rs] Stash as the result of a cheap operation. This is unlike to_glib_none() which is e.g. cheap for CStr but not for String. More on this below. > Also, you don't offer the equivalent of "to_glib_none" which uses a temporary > stash and is quite useful, as a majority of functions don't take ownership. Right, in part you have already answered that: OwnedPointer handles freeing in the same way as g_autofree, so you can use it as a to_glib_none() replacement. If you need to pass the pointer and forget about it, you can use clone_to_foreign_ptr(). If you need something temporary that needs to be freed before returning, you use clone_to_foreign() and as_ptr(). This is not any more unsafe than to_glib_none(). If you pass a temporary pointer (x.clone_to_foreign().as_ptr()) where C code expects full ownership, you get a dangling pointer either way. It is less leak-happy than to_glib_full() though. Regarding how to obtain a pointer cheaply, there are two use cases: - borrow: fn borrow_foreign<'a>(&'a self) -> Stash<'a, <Self as CloneToForeign>::Foreign, &'a Self>; Useful for slices and elementary integer types - including cases where C parameters are passed by reference - and also for CStr. However right now my main user is strings and it doesn't buy much there. Because String and &str are not null-terminated, a "borrow" would allocate memory, as in to_glib_none(), which is pretty weird. At this point, I find it better to have the more explicit clone_to_foreign() that tells you how expensive it is. - consumption, which does not exist in glib-rs: fn into_foreign(self) -> Stash<'static, <Self as CloneToForeign>::Foreign, Self>; This works for types that, like String, deref to a stable address even when moved---this way we don't need to deal with pinning. And it is in fact especially nice for the String case and therefore for anything that uses format!(), because it could be implemented as s.reserve_exact(1); s.push('\0'); The two would be separate traits. There was one case in which I could have used into_foreign(), namely the call to error_setg_internal() in patch 5. But I am not sure if this will be more common than functions taking &str, so I left it for later. > Because much of our code is using GLib types and API style, I > think we should strive for something that is close (if not just the > same) to what glib-rs offers. It's already hard enough to handle > one binding concept, having 2 will only make the matter worse. > Consider a type like GHashTable<GUuid, QOM>, it will be very > annoying to deal with if we have different bindings traits and > implementations and we will likely end up duplicating glib-rs > effort. I'm not even sure if it's even true that much QEMU code is using GLib types. We use quite a bit of GLib, and took a lot of inspiration from it because GLib is good. But it's notable that we use very little GLib *across modules*. There are only a handful of occurrences to GList/GList/GHashTable in function arguments in include/, for all of QEMU, and all of them are very rarely used functions. Why? Possibly because, even where we take inspiration (the Error** idiom, QOM casts), the implementation ends up completely different for reasons that aren't just NIH. In any case I don't see us using glib-rs much, if at all; uses of hash tables in Rust code can just use HashMap<>. Yes, there is no equivalent of the expensive, O(n) conversion self.some_g_hash_table.to_glib_none(); but that's a feature, not a limitation. Also, anyway we couldn't extend the glib traits a lot, and glib-rs only supports HashMap<String, String>. > As for naming & consistency, glib-rs settled on something clearer imho: > > from_glib_full > from_glib_none > to_glib_full > to_glib_none > > vs > > from_foreign > cloned_from_foreign > clone_to_foreign > /nothing/ These last two are respectively clone_to_foreign_ptr and clone_to_foreign. I am not wed to my names but I'm not sure the glib ones are clearer. What I want from names is clarity on who allocates or frees; I do that with "clone", which is a common Rust term, to indicate that the C side still needs to be freed after the call. I find it confusing that from_glib_full() requires no freeing but to_glib_full() requires freeing, for example - I understand _why_ but still "full" and "none" are GIR terms that are not really familiar to most Rust or QEMU developers. And secondarily, clarity on what is cheap and what isn't. I don't have that functionality at all for now, but the nomenclature is easily extended to borrow_foreign() and into_foreign(). Talking about Rust terms, glib-rs lacks "into" which is the standard Rust way to do consuming conversions, while my code has into_native(). All in all, I think there are some significant differences between glib-rs and this implementation that justify the effort. Paolo