+1 for Steve's and Chris's sentiments. Mass reformatting of existing code can make maintaining anything released prior to the makeover very difficult. Almost all of Apache Hadoop's users are not on trunk or branch-2, and I'm not sure we want large refactoring patches going into stability lines like branch-2.8, branch-2.7, and branch-2.6 where most of the users are. We should definitely consider the maintenance costs of refactoring decisions.
Jason On Wednesday, August 9, 2017 4:55 AM, Steve Loughran <ste...@hortonworks.com> wrote: > On 8 Aug 2017, at 21:33, Chris Douglas <cdoug...@apache.org> wrote: > > Lars- > > Welcome! > > As a mild refinement of enthusiasm for this proposal: when you > approach a "cleanup", please consider the cost to tracing the lineage > of changes in the codebase. Working on a project as large and > long-running as Hadoop, we often need to trace what motivated a > particular change using only the commit log and JIRA. Sifting through > cosmetic changes that obscure the reasoning behind a module not worth > the aesthetic benefits of consistently formatted code. As a strawman: > hitting 100% checkstyle compliance would not improve our users' > experience, so please use your judgement. > > As you point out, we're not going to maintain perfect discipline going > forward, either. Nitpicking our contributors beyond what is necessary > to keep the code legible discourages them from continuing as > contributors. As a general heuristic: the stricter the rule, the more > automation is required to enforce it. This prevents everyone from > burning out on minutiae. > > All that said, if you propose a refactoring that makes it easier to > maintain code that's developed more vestigial parts that functional > ones (and we have more than a few of those), that is hugely valuable. > -C That reminds me of a few more issues * Major cleanup patches invariably break handling pending patches from others (which we should review) and also make cherrypicking harder. Which we why we tend to avoid things like a "lets fix the import order" patch for the sake of it. * We can't treat things tagged @Private as stuff we can break on a whim. I know it'd be nice, but often they get picked up because they're the only way to do things...even the example YARN app does this. So changes there always need to go through a scan of the downstream apps. A few of us have IntelliJ set up to include all the main projects so we can find if/where a class or method gets used...and use that to temper our enthusiasm. Chris himself had to deal with this last week with the proposed removal of FileStatus.isDir in HADOOP-14726 . We never want to break downstream code FWIW, I'd use any new styleguide to manage future contribs, not reapply to all existing code, except during other work. Even then, with caution --------------------------------------------------------------------- To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org