[[ cc list trimmed to just qemu-devel ]]

On Fri, Jan 27, 2023 at 8:18 AM Stefan Hajnoczi <stefa...@gmail.com> wrote:
> Dear QEMU, KVM, and rust-vmm communities,
> QEMU will apply for Google Summer of Code 2023
> (https://summerofcode.withgoogle.com/) and has been accepted into
> Outreachy May 2023 (https://www.outreachy.org/). You can now
> submit internship project ideas for QEMU, KVM, and rust-vmm!
>
> Please reply to this email by February 6th with your project ideas.
>
> If you have experience contributing to QEMU, KVM, or rust-vmm you can
> be a mentor. Mentors support interns as they work on their project. It's a
> great way to give back and you get to work with people who are just
> starting out in open source.
>
> Good project ideas are suitable for remote work by a competent
> programmer who is not yet familiar with the codebase. In
> addition, they are:
> - Well-defined - the scope is clear
> - Self-contained - there are few dependencies
> - Uncontroversial - they are acceptable to the community
> - Incremental - they produce deliverables along the way
>
> Feel free to post ideas even if you are unable to mentor the project.
> It doesn't hurt to share the idea!
>

I've been a GSoC mentor for the FreeBSD project on and off for maybe 10-15 years now. I thought I'd share this for feedback here. My project idea falls between the QEMU and FreeBSD projects.

I've been trying to get bsd-user reviewed and upstreamed for some time now, and my time available to do the upstreaming has been greatly diminished lately. It got me thinking: upstreaming is often more than just getting patches reviewed. While there is a rather mechanical aspect to it (and I could likely automate that aspect more), the real value of going through the review process is that it points out things that had been done wrong, things that need to be redone or refactored, etc. It's often these suggestions that lead to the biggest investment of time on my part: Is this idea good? If I do it, does it break things? Is the feedback right about what's wrong, but wrong about how to fix it? etc.
Plus the inevitable: I thought this was a good idea and implemented it, only to find it broke other things, so how do I explain that breakage to the reviewer to see whether it is worth pursuing further or not?

So my idea for a project is three-fold:

First, to create scripts to automate the upstreaming process: to break big files into bite-sized chunks for review on this list. git publish does a great job from there. The current backlog to upstream is approximately "175 files changed, 30270 insertions(+), 640 deletions(-)", which is 300-600 patches at the 50-100 line patch guidance I've been given. So even at 0.1 hours (6 minutes) per patch (which is about 3x faster than I can do it by hand), that's ~30-60 hours just to create the patches. Writing automation should take much less time. Realistically, this is on the order of 10-20 hours to get done.

Second, to take feedback from the reviews and use it to refactor the bsd-user code base (which will eventually land in upstream). I often spend a few hours creating my patches each quarter, then about 10 or so hours processing the review feedback for the 30ish patches that I do: refactoring other things (typically other architectures), checking details of other architectures (usually by looking at the FreeBSD kernel), looking for ways to refactor to share code with linux-user (so far only the safe signals code is upstream; elf could be too), chatting online about the feedback to better understand it, seeing what I can mine from linux-user (since the code is derived from that, but didn't pick up all the changes linux-user has), etc. This would be on the order of 100 hours.

Third, the testing infrastructure that exists for linux-user is not well leveraged to test bsd-user. I've done some tests with it from time to time, but it's not in a state where it can be used as, say, part of a CI pipeline.
In addition, the FreeBSD project has some very large jobs, a subset of which could be used to further ensure that critical bits of infrastructure don't break (or, if not run in a CI pipeline, to confirm that they are working). Things like building and using Go, Rust and the like are constantly breaking for reasons too long to enumerate here. This job could be as little as 50 hours for a minimal job that is complete enough for CI, or as much as 200 hours for a more complete job that could be used to bisect breakage more quickly and give good assurance that at any given time bsd-user is useful and working.

That's in addition to growing the number of people that can work on this code and on the *-user code in general, since they are quite similar.

Some of these tasks are squarely in the QEMU realm, while others are in the FreeBSD realm, but that's similar to linux-user, which requires very heavy interfacing with the Linux realm. It's just that a lot of that work is already complete, so the ongoing needs there are substantially less.

Since it does straddle the two projects, I'm unsure where to propose this project be housed. But since this is a call for ideas, I thought I'd float it to see what the feedback is. I'm happy to write this up more formally if it would be seriously considered, but I want to get feedback as to what areas I might want to emphasize in such a proposal.

Comments?

Warner

> I will review project ideas and keep you up-to-date on QEMU's
> acceptance into GSoC.
>
> Internship program details:
> - Paid, remote work open source internships
> - GSoC projects are 175 or 350 hours, Outreachy projects are 30
> hrs/week for 12 weeks
> - Mentored by volunteers from QEMU, KVM, and rust-vmm
> - Mentors typically spend at least 5 hours per week during the coding
> period
>
> For more background on QEMU internships, check out this video:
> https://www.youtube.com/watch?v=xNVCX7YMUL8
>
> Please let me know if you have any questions!
>
> Stefan
>
>