On Fri, Jan 27, 2023 at 3:02 PM Stefan Hajnoczi <stefa...@gmail.com> wrote:
> On Fri, 27 Jan 2023 at 12:10, Warner Losh <i...@bsdimp.com> wrote:
> >
> > [[ cc list trimmed to just qemu-devel ]]
> >
> > On Fri, Jan 27, 2023 at 8:18 AM Stefan Hajnoczi <stefa...@gmail.com> wrote:
> >>
> >> Dear QEMU, KVM, and rust-vmm communities,
> >> QEMU will apply for Google Summer of Code 2023
> >> (https://summerofcode.withgoogle.com/) and has been accepted into
> >> Outreachy May 2023 (https://www.outreachy.org/). You can now
> >> submit internship project ideas for QEMU, KVM, and rust-vmm!
> >>
> >> Please reply to this email by February 6th with your project ideas.
> >>
> >> If you have experience contributing to QEMU, KVM, or rust-vmm you can
> >> be a mentor. Mentors support interns as they work on their project.
> >> It's a great way to give back and you get to work with people who are
> >> just starting out in open source.
> >>
> >> Good project ideas are suitable for remote work by a competent
> >> programmer who is not yet familiar with the codebase. In
> >> addition, they are:
> >> - Well-defined - the scope is clear
> >> - Self-contained - there are few dependencies
> >> - Uncontroversial - they are acceptable to the community
> >> - Incremental - they produce deliverables along the way
> >>
> >> Feel free to post ideas even if you are unable to mentor the project.
> >> It doesn't hurt to share the idea!
> >
> > I've been a GSoC mentor for the FreeBSD project on and off for maybe
> > 10-15 years now. I thought I'd share this for feedback here.
> >
> > My project idea falls between the two projects. I've been trying
> > to get bsd-user reviewed and upstreamed for some time now and my
> > time available to do the upstreaming has been greatly diminished lately.
> > It got me thinking: upstreaming is more than just getting patches
> > reviewed often times.
> > While there is a rather mechanical aspect to it (and I could likely
> > automate that aspect more), the real value of going through the review
> > process is that it points out things that had been done wrong, things
> > that need to be redone or refactored, etc. It's often these suggestions
> > that lead to the biggest investment of time on my part: Is this idea
> > good? If I do it, does it break things? Is the feedback right about
> > what's wrong, but wrong about how to fix it? etc. Plus the inevitable:
> > I thought this was a good idea, implemented it only to find it broke
> > other things, and how do I explain that and provide feedback to the
> > reviewer about that breakage to see if it is worth pursuing further or
> > not?
> >
> > So my idea for a project has three parts. First, to create scripts to
> > automate the upstreaming process: to break big files into bite-sized
> > chunks for review on this list. git publish does a great job from
> > there. The current backlog to upstream is approximately "175 files
> > changed, 30270 insertions(+), 640 deletions(-)", which is 300-600
> > patches at the 50-100 line patch guidance I've been given. So even
> > at 0.1 hr (6 minutes) per patch (which is about 3x faster than I can do
> > it by hand), that's ~60 hours just to create the patches. Writing
> > automation should take much less time. Realistically, this is on the
> > order of 10-20 hours to get done.
> >
> > Second, it's to take feedback from the reviews for refactoring
> > the bsd-user code base (which will eventually land in upstream).
> > I often spend a few hours creating my patches each quarter, then about
> > 10 or so hours, for the 30ish patches that I do, processing the review
> > feedback by refactoring other things (typically other architectures),
> > checking details of other architectures (usually by looking at the
> > FreeBSD kernel), or looking for ways to refactor to share code with
> > linux-user (though so far only the safe signals is upstream: elf could
> > be too), or chatting online about the feedback to better understand it,
> > to see what I can mine from linux-user (since the code is derived from
> > that, but didn't pick up all the changes linux-user has), etc. This
> > would be on the order of 100 hours.
> >
> > Third, the testing infrastructure that exists for linux-user is not
> > well leveraged to test bsd-user. I've done some tests from time to time
> > with it, but it's not in a state that it can be used as, say, part of a
> > CI pipeline. In addition, the FreeBSD project has some very large jobs,
> > a subset of which could be used to further ensure that critical bits of
> > infrastructure don't break (or are working if not in a CI pipeline).
> > Things like building and using go, rust and the like are constantly
> > breaking for reasons too long to enumerate here. This job could be as
> > little as 50 hours to do a minimal but complete-enough CI job, or as
> > much as 200 hours to do a more complete job that could be used to
> > bisect breakage more quickly and give good assurance that at any given
> > time bsd-user is useful and working.
> >
> > That's in addition to growing the number of people that can work on
> > this code and on the *-user code in general since they are quite
> > similar.
> >
> > Some of these tasks are squarely in the qemu-realm, while others are in
> > the FreeBSD realm, but that's similar to linux-user which requires very
> > heavy interfacing with the linux realm.
> > It's just that a lot of that work is already complete so the needs are
> > substantially less there on an ongoing basis. Since it does straddle
> > the two projects, I'm unsure where to propose this project be housed.
> > But since this is a call for ideas, I thought I'd float it to see what
> > the feedback is. I'm happy to write this up in a more formal sense if
> > it would be seriously considered, but want to get feedback as to what
> > areas I might want to emphasize in such a proposal.
> >
> > Comments?
>
> Hi Warner,
> Don't worry about it spanning FreeBSD and QEMU, you're welcome to list
> the project idea through QEMU. You can have co-mentors that are not
> part of the QEMU community in order to bring in additional FreeBSD
> expertise.
>
> My main thought is that getting all code upstream sounds like a
> sprawling project that likely won't be finished within one internship.
> Can you pick just a subset of what you described? It should be a
> well-defined project that depends minimally on other people finishing
> stuff or reaching agreement on something controversial. That way the
> intern will be able to come up with specific tasks for their project
> plan and there is little risk that they can't complete them due to
> outside factors.

I like this notion of limiting the scope. There are three or maybe four
main areas that I can call out. I got to thinking about all the details I
have to do for how I've been upstreaming things, and realized that there's
a lot due to the complicated history here...

> One way to go about this might be for you to define a milestone that
> involves completing, testing, and upstreaming just a subset of the
> out-of-tree code. For example, it might implement a limited set of
> core syscall families. The intern will then focus on delivering that
> instead of worrying about the daunting task of getting everything
> merged. Finishing this subset would advance bsd-user FreeBSD support
> by a useful degree (e.g. ability to run certain applications).
>
> Does that sound good?

Yes. I like this, but it's hard to know what that might be because many
things are hidden behind the scenes... But I'll try running a quick build
to see if I can gather enough stats to come up with a good set of tests...
But maybe I'll start with building 'hello world' with clang on armv7
running on an amd64 host to see what's missing today. I also have an
aarch64 set of patches I might try hard to get in ASAP so that might be the
target instead (since it might be a bit more useful).

Warner

> Stefan