Thanks John, another one for my list of things to look at.

On Mon, Aug 7, 2017 at 11:24 PM, John Zhuge <john.zh...@gmail.com> wrote:
> And check out HADOOP-12145
> <https://issues.apache.org/jira/browse/HADOOP-12145> Organize and update
> CodeReviewChecklist wiki.
>
> Thanks, your contribution will be greatly appreciated!
>
>
> On Mon, Aug 7, 2017 at 5:53 AM, Steve Loughran <ste...@hortonworks.com>
> wrote:
>
>>
>> Hi Lars & Welcome!
>>
>> Maybe the first step here would be to look at those style guides and think
>> about how to bring them up to date, especially with stuff like lambda
>> expressions in Java 8, modules forthcoming in Java 9, SLF4J logging,
>> JUnit 4 -> 5 testing, code instrumentation, diagnostics, log stability, etc.
>>
>> https://issues.apache.org/jira/browse/HADOOP-12143
>>
>> This is my go at doing this:
>>
>> https://github.com/steveloughran/formality/blob/master/styleguide/styleguide.md
>>
>>
>> I've not done any work on trying to get it in, more evolving it as how I
>> code & what I look for, especially in tests.
>>
>> If you want to take this on, it'd be nice. At the same time, I fear
>> there'd be push back if you turned up and started telling people what to
>> do. Collaborating with us all on the test code is a good place to start.
>>
>> We're also more relaxed about contributions to the less-core bits of the
>> system (things like HDFS, IPC, security and YARN core are trouble). If
>> there's stuff outside that you want to take a go at helping clean up,
>> that'd be lower risk (example: object store connectors).
>>
>> -Steve
>>
>>
>>
>> On 7 Aug 2017, at 13:13, Lars Francke <lars.fran...@gmail.com> wrote:
>>
>> Hi,
>>
>> a few words about me: I've contributed to Hadoop (and its ecosystem[4]) in
>> the past, am a Hive committer and have used Hadoop for 10 years now, so I'm
>> not totally inexperienced. I'm earning my money as a Hadoop consultant so
>> I've seen dozens of real-life clusters in my life.
>>
>> As part of a few recent client projects and now writing about Hadoop in a
>> new project/book I'm digging into the source code to figure out some of the
>> things that are not documented.
>>
>> But as part of this digging I'm seeing lots of warnings in the code,
>> inconsistencies etc. and I'd like to contribute some fixes to this back to
>> the community.
>>
>> I have been a long-time believer in good code quality and consistent code
>> styles. This might affect people like me especially who do a lot of
>> "drive-by" contributions as I'm not someone who looks at the code daily but
>> comes across it reasonably often as part of client engagements. In those
>> scenarios, it's very unhelpful to have inconsistent code & bad
>> documentation.
>>
>> Two simple but concrete examples:
>> * There's lots of "final" usages on variables and methods but no
>> consistency. Was this done for particular reasons or personal preference?
>>
>> Personal, though with a move to lambda expressions, it matters a lot more.
>> We should really be marking all parameters as final at the very least.
>>
>>
>> * Similarly, there's lots of things that are public or protected while they
>> could in theory be private. This especially makes it very hard to reason
>> about code.
>>
>> There's now a bit of fear of breaking things, but at the very least,
>> things could be protected or package-private more than they are.
>>
>>
>>
>> Judging from the current code there's lots of "unofficial" code styling
>> and/or personal preference. The Wiki says[1] to follow the Sun
>> guidelines[2] which have not been updated in almost 20 years. A new version
>> is in the works and clarifies a lot of things[3]. I'm trying to get it
>> published soon. I'd try to format according to the latter (that means among
>> other things no "final" for local variables).
>>
>> I realize that I won't be able to single-handedly fix all of this
>> especially as code gets contributed but if the community thinks it's
>> worthwhile I'd still love to land a few cleanup patches. My experience in
>> the past has been that it's hard to get attention to these things (which I
>> fully understand as they take up someone's time to review & commit).
>>
>> So, this is my request for comments on these questions:
>> * Is there any interest in this at all?
>> ** "This" being patches for code style & things like FindBugs & Checkstyle
>> warnings
>> * Size of the patches: Rather one big patch or smaller ones (e.g. per file
>> or package)
>> * Anyone willing to help me with this? e.g. reviewing and committing? I'd
>> be more than happy to bribe you with drinks, sweets, food or something else
>>
>> My plan is not to go through each and every file and fix every issue I see.
>> But there are some specific areas I'm looking at in detail and there I'd
>> love to contribute back.
>>
>> Thank you for reading!
>>
>> Cheers,
>> Lars
>>
>> PS: Posting to common-dev only, not sure if I should cross post to hdfs-dev
>> and yarn-dev as well?
>>
>> [1] <https://wiki.apache.org/hadoop/CodeReviewChecklist>
>> [2] <http://www.oracle.com/technetwork/java/javase/documentation/codeconvtoc-136057.html>
>> [3] <http://cr.openjdk.java.net/~alundblad/styleguide/index-v6.html>
>> [4] <https://issues.apache.org/jira/issues/?filter=-1&jql=reporter%20%3D%20lars_francke%20OR%20assignee%20%3D%20lars_francke>
>>
>
>
> --
> John
>