Thanks for the details!

To be honest, I still prefer the "light build" approach with gradle,
because it's pretty easy for contributors to check license headers in
their contributed file (as with gradle plugin, it will be included in
the check phase).
I think it's good to have it in the regular "local" contributor build
instead of need of docker/workflow execution.

Just my $0.01

Regards
JB

On Fri, Oct 27, 2023 at 5:16 PM Xuanwo <xua...@apache.org> wrote:
>
> Here are some quick notes for skywalking-eyes, hoping them will be helpful.
>
> Before using skywalking-eyes, we need to setup config as said in [1]. Take 
> iceberg-rust as an example [2].
>
> For checking in CI:
>
> Adding following content in workflow [3]
>
> - name: Check License Header
>   uses: apache/skywalking-eyes/header@v0.5.0
>
> For local usage:
>
> docker run -it --rm -v $(pwd):/github/workspace apache/skywalking-eyes header 
> check
> docker run -it --rm -v $(pwd):/github/workspace apache/skywalking-eyes header 
> fix
>
> [1]: 
> https://github.com/apache/skywalking-eyes?tab=readme-ov-file#configurations
> [2]: https://github.com/apache/iceberg-rust/blob/main/.licenserc.yaml
> [3]: 
> https://github.com/apache/iceberg-rust/blob/94a1c5d7742bc3b2a9ac7c8da20711a5e2578b89/.github/workflows/ci.yml#L38C1-L39C51
>
> On Fri, Oct 27, 2023, at 22:17, Jean-Baptiste Onofré wrote:
> > Thanks for the heads up Xuanwo.
> >
> > It's the fourth option :) I will make a comparison with RAT.
> >
> > Regards
> > JB
> >
> > On Fri, Oct 27, 2023 at 12:15 PM Xuanwo <xua...@apache.org> wrote:
> >>
> >> iceberg-rust is using apache/skywalking-eyes/header@v0.5.0 now.
> >>
> >> BTW, we found skywalking-eyes works really well. It's fast, correct and 
> >> well-maintained.
> >>
> >> Maybe worth take a look.
> >>
> >> On Fri, Oct 27, 2023, at 17:48, Jean-Baptiste Onofré wrote:
> >> > By the way, as dev/check-license is also used in iceberg-python and
> >> > iceberg-go repositories (iceberg-rust doesn't have it), maybe I can
> >> > move forward on new rat release with the fix on hidden directories and
> >> > update there as well.
> >> >
> >> > Regards
> >> > JB
> >> >
> >> > On Thu, Oct 26, 2023 at 5:19 PM Jean-Baptiste Onofré <j...@nanthrax.net> 
> >> > wrote:
> >> >>
> >> >> Hi guys,
> >> >>
> >> >> During the 1.4.1 vote, we identified some files without ASF headers,
> >> >> more specifically in hidden directories (like .baseline). These files
> >> >> have not been detected by dev/check-license script.
> >> >>
> >> >> The reason is because the script uses apache-rat via java -jar (the
> >> >> Apache RAT CLI), executing the RAT Report class ()
> >> >>
> >> >> When providing a directory to scan, the Report class uses a
> >> >> DirectoryWalker to traverse the directories, looking for files to
> >> >> check.
> >> >> Unfortunately, by default, the DirectoryWalker ignores the hidden
> >> >> directory (all directories starting with .):
> >> >>
> >> >> https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/walker/DirectoryWalker.java#L71
> >> >>
> >> >> https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/walker/Walker.java#L53
> >> >>
> >> >> In our case, it means that it ignores .baseline, .github, .git,
> >> >> .palantir directories. This is not good as .baseline, .github and
> >> >> .palantir directories are included in our source distribution, so the
> >> >> ASF headers should be clean here.
> >> >>
> >> >> FYI, I will propose a change in rat to, at least, be able to use a
> >> >> ReportConfiguration to define if we want to restrict directories or
> >> >> not (be able to configure the Walkers basically). I will try to
> >> >> include it in rat 0.17 (I discussed with Claude about that).
> >> >> NB: rat gradle and maven plugins define their own Walkers/IReportable
> >> >> to avoid this issue.
> >> >>
> >> >> I propose three options to improve this:
> >> >> 1. We keep dev/check-license as it is today (using rat 0.15) and we
> >> >> exclude .baseline, .github, .palantir directories from our source
> >> >> distribution. It means that we will probably have issues while
> >> >> building from source distribution. We can revisit dev/check-license
> >> >> when we upgrade to rat 0.17
> >> >>
> >> >> 2. We keep dev/check-license but we create our own rat scanner with a
> >> >> custom IReportable class considering all files/directories. Something
> >> >> like:
> >> >>
> >> >>     public static void main(String[] args) throws Exception {
> >> >>         ReportConfiguration reportConfiguration = new 
> >> >> ReportConfiguration();
> >> >>         
> >> >> reportConfiguration.setHeaderMatcher(Defaults.createDefaultMatcher());
> >> >>
> >> >>         XmlWriter writer = new XmlWriter(new FileWriter("report.xml"));
> >> >>         ClaimStatistic claimStatistic = new ClaimStatistic();
> >> >>         RatReport ratReport = XmlReportFactory.createStandardReport(
> >> >>                 writer,
> >> >>                 claimStatistic,
> >> >>                 reportConfiguration
> >> >>         );
> >> >>         ratReport.startReport();
> >> >>         IncludeHiddenDirectoryWalker walker = new
> >> >> IncludeHiddenDirectoryWalker(new File("/path/to/iceberg"));
> >> >>         walker.run(ratReport);
> >> >>         ratReport.endReport();
> >> >>         writer.closeDocument();
> >> >>     }
> >> >>
> >> >> public class IncludeHiddenDirectoryWalker implements IReportable {
> >> >>
> >> >>
> >> >>     private File file;
> >> >>
> >> >>     public IncludeHiddenDirectoryWalker(File file) {
> >> >>         this.file = file;
> >> >>     }
> >> >>
> >> >>     @Override
> >> >>     public void run(RatReport report) throws RatException {
> >> >>         process(report, file);
> >> >>     }
> >> >>
> >> >>     public void process(RatReport report, File file) {
> >> >>         final File[] files = file.listFiles();
> >> >>         if (files != null) {
> >> >>             for (File current : files) {
> >> >>                 if (current.isDirectory()) {
> >> >>                     process(report, current);
> >> >>                 } else {
> >> >>                     try {
> >> >>                         Document document = new FileDocument(current);
> >> >>                         report.report(document);
> >> >>                     } catch (RatException e) {
> >> >>                         System.err.println("Can't report file " +
> >> >> current.getAbsolutePath() + ": " + e);
> >> >>                     }
> >> >>                 }
> >> >>             }
> >> >>         }
> >> >>     }
> >> >>
> >> >> }
> >> >>
> >> >> So, I can contribute this in dev/src for example.
> >> >>
> >> >> 3. Instead of using dev/check-license, we can use the rat gradle
> >> >> plugin (https://github.com/eskatos/creadur-rat-gradle/). I tested it
> >> >> and it works fine as it uses a custom IReportable like:
> >> >>
> >> >> https://github.com/eskatos/creadur-rat-gradle/blob/master/src/main/kotlin/org/nosphere/apache/rat/RatWork.kt#L135
> >> >>
> >> >> https://github.com/eskatos/creadur-rat-gradle/blob/master/src/main/kotlin/org/nosphere/apache/rat/RatWork.kt#L189
> >> >>
> >> >> We can include this plugin the check gradle phase, meaning that we can
> >> >> verify headers for each PR.
> >> >>
> >> >> My preference would be for 3, mainly because:
> >> >> 1. it integrates smoothly in our gradle ecosystem, adding a new plugin
> >> >> as we have gradle-baseline-java, gradle-errorprone-plugin,
> >> >> spotless-plugin-gradle, etc
> >> >> 2. As we can hook rat gradle plugin in the gradle check task, it means
> >> >> license check will be perform at build time, including check on PR by
> >> >> GitHub Actions
> >> >>
> >> >> If you are OK with 3, I will work on:
> >> >> 1. a PR to use it
> >> >> 2. a PR for website to update release check procedure
> >> >>
> >> >> Thoughts ?
> >> >>
> >> >> Regards
> >> >> JB
> >>
> >> --
> >> Xuanwo
>
> --
> Xuanwo

Reply via email to