On Mon, Sep 28, 2020 at 10:28 PM Bruno Haible <br...@clisp.org> wrote:
>
> It's a pity that grep-2.5 was released with such a mistake.
> ...
> How can we avoid such things?

Just my 2 cents, but I think the technical controls are good. You have a CI pipeline, and you are testing on multiple platforms. You are also asking the community to test on their favorite hardware and platforms during development and before a release. You may be able to increase platform coverage, but I don't think that is the issue.

I think the next improvement to the release process should engineer around human behavior. Here, the human behavior is that most folks skip pre-release testing and jump right to the release. Then they find that small issues made it into the release.

Here's how several projects I work with handle it.

  1. CI pipelines to provide assurances on Master
  2. Prior to release, create an RC
  3. Call for testers
  4. Accept feedback on the RC
  5. Fix issues, back to (2)

Once you are happy with the state of the code, release. In the case of Grep, say you release Grep 3.5.

  6. Release the updated software

Now here are the new steps to consider:

  7. Accept feedback on the release
  8. Fix issues
  9. Release a minor version at T+30 days

Step (7) is the standard feedback loop in a software development lifecycle. You have two feedback cycles: one at Step (4), and one at Step (7). That's OK. It simply states what you already do - you fix issues.

At Step (9), you release Grep 3.5.1 at T+30 days if there are bugs in Grep 3.5 that are generating mailing list messages and bug reports. Those messages and bug reports will keep recurring because the bugs shipped in a release tarball.

Steps (7) - (9) take into account that many folks will not perform pre-release testing. But they will update and then report problems back to the project.

It has been my experience that once you make the minor version release, like Grep 3.5.1, that's it, you're done. You won't have to worry about it again until Grep 3.6 is released.

What really happened is that you used an administrative control to engineer around human behavior. You did not add any new technical controls.

Jeff