I like this idea! It would make it easier to follow why these decisions were made. But if we are to adopt this idea, we might need to decide which decisions we want to include in these ADRs. Or maybe we could include a checkbox on the PR template for the committers to decide whether to ask the author to add an ADR before merging.
On the other hand, I also like Niko's suggestion a lot. Some basic annotation/metadata makes it easier to trace the log, and git blame is really handy. It is easier to check compared to the doc. And the commit rule doesn't need to be complicated. But yes, this kind of rule would definitely be a barrier to new contributors, which will be something we need to consider as well. Best, Wei > On Dec 5, 2023, at 6:20 AM, Jarek Potiuk <ja...@potiuk.com> wrote: > >> >> I love this idea! >> >> Another option, that I don't think we as a community are very good at, is >> putting the context of the change in the git commit message itself. Those >> messages are already tightly associated into git history and the code >> itself via blame without needing to introduce an new concept for this >> purpose. Those commit messages can be viewed with many existing tools that >> can browse Git blame/log. Some projects like Linux or Git itself have very >> large commit messages with all the context required, a random example I >> pulled from the first page of most recent commits: >> https://github.com/torvalds/linux/commit/d67f39d2b81b6a8259944d2400c1ff4fe283ff72 > > > Oh absolutely - if the single commit is a "complete" change - I am all for > long messages. > Take a look at some of my commit messages :D. > > I'd love some of our commit messages that are detailed and describe the > context of the PRs. Especially in cases where it affects more than just > obvious "Adding this or that new operator" for example. > > I think that we do not also need to capture all decisions this way. For me > this is really a question of having something in between a substantial > commit (that needs a long message) and AIP. Something that is a "shared" > and "common" piece of infrastructure that introduces a new "idea" or new > "common utility" or "concept" to be followed by others in the future. > > The "common.sql" and "serde" are IMHO good examples of such "concepts". > The consequences of decisions made there will be carried over in multiple > other commits, and it's great to have it written and recorded in the repo > as a "markdown" file - so that we do not have to find the original commit > that introduced it. It is right there, where you need it - in the folder > where the source code lives that it relates to. > > I am not a big fan of having "one rule" for all commits and changes, I > really think we should have a gradation: > > 1) simple fix or adding new operator/hook that needs no extra context -> > fine with single line commit > 2) bigger fix that needs more explanation why we are doing it/context that > will not be obvious after some time -> long commit message explaining why > we are doing it, context, problem it solves, etc. > 3) a concept that comes and gets refined in probably multiple PRs > introducing a "shared" way of doing things that we want others to follow -> > ADR describing the concept, decisions, alternatives, consequences (and can > evolve over time by adding more records). > 4) big architectural change which requires a lot of deliberation and voting > and has wide impact on the architecture of Airflow -> AIP > > I think we do not really want to introduce a lot of bureaucracy, we should > only add more "documentation" expectations where it really matters to keep > it for future maintainers (and future selves as well as I often find those > kind of notes and commit messages very useful months and years from the > time they were made. > > BTW. Something that made my day recently "If bureaucracy was meant to be > easy, they'd have made the word easier to spell". > > J. > > > >> >> >> You can see the git commit messages is MUCH longer than the code change >> itself even! So if you're curious why that code is the way it is, you can >> just git blame it and have all the context there. >> >> The ADR having markdown is nice, it allows you a bit more formatting, but >> then it also requires a couple more steps to view that formatting. >> >> Cheers, >> Niko >> >> ________________________________ >> From: Vincent Beck <vincb...@apache.org> >> Sent: Monday, December 4, 2023 11:22:54 AM >> To: dev@airflow.apache.org >> Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Capturing >> Architectural decisions (ADRS?) >> >> CAUTION: This email originated from outside of the organization. Do not >> click links or open attachments unless you can confirm the sender and know >> the content is safe. >> >> >> >> AVERTISSEMENT: Ce courrier électronique provient d’un expéditeur externe. >> Ne cliquez sur aucun lien et n’ouvrez aucune pièce jointe si vous ne pouvez >> pas confirmer l’identité de l’expéditeur et si vous n’êtes pas certain que >> le contenu ne présente aucun risque. >> >> >> >> I love this idea. I definitely think it can improve a lot the knowledge >> sharing across Airflow. Given the history and the number of components in >> Airflow, it is hard to keep up with everything, so having these ADRs would >> help a lot I think! >> >> On 2023/12/03 23:57:11 Jarek Potiuk wrote: >>> Hey everyone, >>> >>> I think we had a bit of clash in >>> https://github.com/apache/airflow/pull/32319 where both "ideas" >>> (serialization and common.sql) had not been sufficiently >>> discussed/explained and I hope we can address it by adding (a bit) more >>> "whys" to our (developer) documentation. >>> >>> I think a number of our past decisions and reasoning behind them are >> often >>> staying in the heads of the people who were discussing them, and even if >> it >>> is captured in past discussions, PRs, it's difficult to do "archeology" >> on >>> them and re-process them and understand what we wanted to achieve and >> why. >>> Some of those are big enough to have impact on future PRs etc. while not >>> big enough to get to Airflow Improvement Proposals and I think we miss >>> a bit of persistent "decision records" for them. >>> >>> Two cases in question: Serialization and Common.sql API - both of which >>> have not been understood well by people involved in one, but not the >> other >>> in the past. >>> >>> With the "common.sql" PR (https://github.com/apache/airflow/pull/36015) >> - >>> my proposal is to add it in the form of ADR ("Architecture Decision >>> Records' - which is a very simple and lightweight way of storing the >>> decisions we made - and evolving them. >>> >>> ADRs are pretty popular and adopted in mature organisations/projects and >>> I've used them in `breeze` >>> https://github.com/apache/airflow/tree/main/dev/breeze/doc/adr and I >> think >>> they are perfect for capturing, context, decisions and putting down the >>> consequences of some decisions. >>> >>> They are usually kept close to the code the decision is about, they are >>> usually short and describe a single aspect of architectural decision, and >>> they are aimed to be read whenever in the future, people who were not >>> involved in those decisions can easily recover why the decisions are made >>> and what are the consequences of it. >>> >>> I am not saying - of course - we should do it for all or even most >> changes >>> - I am talking about decisions that have potential impact on others - in >>> the future. I.e. when we tell (this is how our approach should look in >> the >>> future for "general" behaviour. >>> >>> Both - serialization and common.sql are good examples of such decisions - >>> that I believe deserve to be captured "why" we are doing them and what we >>> wanted to achieve. >>> >>> WDYT? >>> >>> J. >>> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org >> For additional commands, e-mail: dev-h...@airflow.apache.org