Sorry for the late reply to this ...

On Tue, Apr 30, 2019 at 06:28:01PM +0200, Pino Toscano wrote:
> On Friday, 9 February 2018 19:01:53 CEST Richard W.M. Jones wrote:
> > My contention is that the libguestfs git repository is too large and
> > unwieldy.  There are too many separate, unrelated projects and as a
> > result of that the source has too many dependencies and takes too long
> > to build and test.
> >
> > The project divides (sort of) naturally into layers -- the library,
> > the bindings, the various virt tools -- and could be split along those
> > lines into separate projects which can then be released and evolve at
> > their own pace.
>
> As also other answers to this email say, splitting tools, and bindings
> may be very complex, and thus for now it is still a too far goal.
>
> However...
>
> > My suggested split would be something like this:
> >
> > [...]
> >   virt-v2v and virt-p2v
>
> I'd rather split virt-p2v in its own repository.  There are various
> reasons for this:
> - it does not use libguestfs (the library), just the tools for testing
>   stuff
> - the communication with virt-v2v is done via network, and its
>   capabilities are dynamically probed (so theoretically virt-p2v, and
>   virt-v2v can be used even when their versions are odd)
> - it is written only in C
>
> However, even if it looks simple, in reality there are a number of common
> things used from the rest of the libguestfs tree:
> 1) gnulib

We hardly use gnulib in virt-p2v.  I think it's only used for
ignore-value.h, getprogname.h, and c-ctype.h, all of which are likely
to be easily worked around.

> 2) some build system bits (e.g. m4/guestfs-v2v.m4)

Right, although this in itself should be split up, so no bad thing.

> 3) auto-cleanup bits (e.g. CLEANUP_FREE), although only few are used
>    (CLEANUP_FREE, CLEANUP_FREE_STRING_LIST, CLEANUP_PCLOSE,
>    CLEANUP_FCLOSE, and CLEANUP_XMLFREETEXTWRITER)
> 4) other internal macros, i.e. guestfs-utils.h

Common code is a bit trickier, as is ...

> 5) the list of credits generated by the generator
>    (i.e. generator/authors.ml)
> 6) the p2v configuration generated by the generator
>    (i.e. generator/p2v_config.ml)

... the generator and ...

> 7) test images/data (phony images, and virt-tools)

test data.

> 8) the miniexpect module, right now out of the p2v subdirectory

This is only used by virt-p2v I think, so it could go with virt-p2v or
be made into a separate project.

> Possible solutions may/might be:
> 1) add own submodule (use its own set of modules)

I think we should ditch gnulib as much as possible, so see above.

> 2) copy/implement them locally: luckily they are not many, so
>    inlining them in configure.ac will not be a problem; the common
>    bits (e.g.
>    the distro detection from os-release) can be split in
>    its own module in libguestfs, copying it in p2v
> 3/4) have a local version of them; not pretty, although they are not
>    that many
> 5) this list is reflected in two places: the p2v/about-authors.c file,
>    and the AUTHORS file (theoretically mandatory for automake, unless
>    "foreign" is used, which it is); my idea was to go back to a manually
>    written about-authors.c file without the libguestfs credits, leaving
>    the few p2v ones easy to manage; the same for the AUTHORS file
> 6) this is a bit more complex: my idea was to keep it as an OCaml script
>    to run at build time, instead of being statically shipped at dist
>    time
> 7) create their own versions at test time using guestfish/virt-builder;
>    maybe use a Fedora image, instead of a phony Windows one (will avoid
>    hivex for the tests)
> 8)

So while I'm not a massive fan of git submodules, now that I have used
them a few times with riscv stuff, they do solve a certain problem as
long as they are managed carefully.  I think the common code and the
generator are cases where a submodule or two would work.

Does this mean we need to move immediately to a submodule if just
splitting virt-p2v, or copy code as you suggest?  Maybe not, because
you can imagine for just this project copying the code needed from the
common/ directory, and creating a new "mini-generator" for the project
which handles the little bits that need to be generated in virt-p2v.

However in the long term if we split up everything a submodule or two
does seem to make sense, so maybe we should start there?

> The other problem is how to split the repository, as the various bits
> are in different places:
> a) git filter-branch --subdirectory-filter p2v
>    + very small repo with the current p2v subdirectory
>    + preserves the history of the p2v subdirectory, with branches and tags
>    - missing all the other bits, which will have no history
>    - not usable to build older releases (e.g.
>      for bisecting)

I'm not exactly sure what this does.  Is this something to do with
preserving the history?  TBH I don't think we need to bother with the
history -- it exists still in libguestfs.git.

> b) create a work branch in libguestfs, then in that branch move/copy all
>    the stuff making the p2v subdirectory build standalone there, and then
>    import the content of the p2v subdirectory of that branch in a new empty
>    repo
>    + very small repo with the current p2v subdirectory
>    - no history, no tags nor branches
>    + using a graft it is possible to "stitch" the history of the new repo
>      with the work branch in libguestfs
>
> c) git filter-branch to remove all the bits not related to p2v from all
>    the commits
>    + not that big repo
>    + preserves the history of all the content, with branches and tags
>    - will take a very long time to create (e.g. iterate over and over to
>      find out what to remove)
>    - not usable to build older releases (e.g. for bisecting)

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/