We already run RAT checks on every PR, so I'm not sure there's a lot of
value in moving the checks to gradle. That just means that we would need to
use a different framework across the implementations. If there's a way to
run license checks in CI that doesn't have the dot-file limitation, that
seems ideal to me.

On Fri, Oct 27, 2023 at 8:46 AM Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> Thanks for the details!
>
> To be honest, I still prefer the "light build" approach with gradle,
> because it's pretty easy for contributors to check license headers in
> their contributed file (as with gradle plugin, it will be included in
> the check phase).
> I think it's good to have it in the regular "local" contributor build
> instead of need of docker/workflow execution.
>
> Just my $0.01
>
> Regards
> JB
>
> On Fri, Oct 27, 2023 at 5:16 PM Xuanwo <xua...@apache.org> wrote:
> >
> > Here are some quick notes for skywalking-eyes, hoping them will be
> helpful.
> >
> > Before using skywalking-eyes, we need to setup config as said in [1].
> Take iceberg-rust as an example [2].
> >
> > For checking in CI:
> >
> > Adding following content in workflow [3]
> >
> > - name: Check License Header
> >   uses: apache/skywalking-eyes/header@v0.5.0
> >
> > For local usage:
> >
> > docker run -it --rm -v $(pwd):/github/workspace apache/skywalking-eyes
> header check
> > docker run -it --rm -v $(pwd):/github/workspace apache/skywalking-eyes
> header fix
> >
> > [1]:
> https://github.com/apache/skywalking-eyes?tab=readme-ov-file#configurations
> > [2]: https://github.com/apache/iceberg-rust/blob/main/.licenserc.yaml
> > [3]:
> https://github.com/apache/iceberg-rust/blob/94a1c5d7742bc3b2a9ac7c8da20711a5e2578b89/.github/workflows/ci.yml#L38C1-L39C51
> >
> > On Fri, Oct 27, 2023, at 22:17, Jean-Baptiste Onofré wrote:
> > > Thanks for the heads up Xuanwo.
> > >
> > > It's the fourth option :) I will make a comparison with RAT.
> > >
> > > Regards
> > > JB
> > >
> > > On Fri, Oct 27, 2023 at 12:15 PM Xuanwo <xua...@apache.org> wrote:
> > >>
> > >> iceberg-rust is using apache/skywalking-eyes/header@v0.5.0 now.
> > >>
> > >> BTW, we found skywalking-eyes works really well. It's fast, correct
> and well-maintained.
> > >>
> > >> Maybe worth take a look.
> > >>
> > >> On Fri, Oct 27, 2023, at 17:48, Jean-Baptiste Onofré wrote:
> > >> > By the way, as dev/check-license is also used in iceberg-python and
> > >> > iceberg-go repositories (iceberg-rust doesn't have it), maybe I can
> > >> > move forward on new rat release with the fix on hidden directories
> and
> > >> > update there as well.
> > >> >
> > >> > Regards
> > >> > JB
> > >> >
> > >> > On Thu, Oct 26, 2023 at 5:19 PM Jean-Baptiste Onofré <
> j...@nanthrax.net> wrote:
> > >> >>
> > >> >> Hi guys,
> > >> >>
> > >> >> During the 1.4.1 vote, we identified some files without ASF
> headers,
> > >> >> more specifically in hidden directories (like .baseline). These
> files
> > >> >> have not been detected by dev/check-license script.
> > >> >>
> > >> >> The reason is because the script uses apache-rat via java -jar (the
> > >> >> Apache RAT CLI), executing the RAT Report class ()
> > >> >>
> > >> >> When providing a directory to scan, the Report class uses a
> > >> >> DirectoryWalker to traverse the directories, looking for files to
> > >> >> check.
> > >> >> Unfortunately, by default, the DirectoryWalker ignores the hidden
> > >> >> directory (all directories starting with .):
> > >> >>
> > >> >>
> https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/walker/DirectoryWalker.java#L71
> > >> >>
> > >> >>
> https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/walker/Walker.java#L53
> > >> >>
> > >> >> In our case, it means that it ignores .baseline, .github, .git,
> > >> >> .palantir directories. This is not good as .baseline, .github and
> > >> >> .palantir directories are included in our source distribution, so
> the
> > >> >> ASF headers should be clean here.
> > >> >>
> > >> >> FYI, I will propose a change in rat to, at least, be able to use a
> > >> >> ReportConfiguration to define if we want to restrict directories or
> > >> >> not (be able to configure the Walkers basically). I will try to
> > >> >> include it in rat 0.17 (I discussed with Claude about that).
> > >> >> NB: rat gradle and maven plugins define their own
> Walkers/IReportable
> > >> >> to avoid this issue.
> > >> >>
> > >> >> I propose three options to improve this:
> > >> >> 1. We keep dev/check-license as it is today (using rat 0.15) and we
> > >> >> exclude .baseline, .github, .palantir directories from our source
> > >> >> distribution. It means that we will probably have issues while
> > >> >> building from source distribution. We can revisit dev/check-license
> > >> >> when we upgrade to rat 0.17
> > >> >>
> > >> >> 2. We keep dev/check-license but we create our own rat scanner
> with a
> > >> >> custom IReportable class considering all files/directories.
> Something
> > >> >> like:
> > >> >>
> > >> >>     public static void main(String[] args) throws Exception {
> > >> >>         ReportConfiguration reportConfiguration = new
> ReportConfiguration();
> > >> >>
>  reportConfiguration.setHeaderMatcher(Defaults.createDefaultMatcher());
> > >> >>
> > >> >>         XmlWriter writer = new XmlWriter(new
> FileWriter("report.xml"));
> > >> >>         ClaimStatistic claimStatistic = new ClaimStatistic();
> > >> >>         RatReport ratReport =
> XmlReportFactory.createStandardReport(
> > >> >>                 writer,
> > >> >>                 claimStatistic,
> > >> >>                 reportConfiguration
> > >> >>         );
> > >> >>         ratReport.startReport();
> > >> >>         IncludeHiddenDirectoryWalker walker = new
> > >> >> IncludeHiddenDirectoryWalker(new File("/path/to/iceberg"));
> > >> >>         walker.run(ratReport);
> > >> >>         ratReport.endReport();
> > >> >>         writer.closeDocument();
> > >> >>     }
> > >> >>
> > >> >> public class IncludeHiddenDirectoryWalker implements IReportable {
> > >> >>
> > >> >>
> > >> >>     private File file;
> > >> >>
> > >> >>     public IncludeHiddenDirectoryWalker(File file) {
> > >> >>         this.file = file;
> > >> >>     }
> > >> >>
> > >> >>     @Override
> > >> >>     public void run(RatReport report) throws RatException {
> > >> >>         process(report, file);
> > >> >>     }
> > >> >>
> > >> >>     public void process(RatReport report, File file) {
> > >> >>         final File[] files = file.listFiles();
> > >> >>         if (files != null) {
> > >> >>             for (File current : files) {
> > >> >>                 if (current.isDirectory()) {
> > >> >>                     process(report, current);
> > >> >>                 } else {
> > >> >>                     try {
> > >> >>                         Document document = new
> FileDocument(current);
> > >> >>                         report.report(document);
> > >> >>                     } catch (RatException e) {
> > >> >>                         System.err.println("Can't report file " +
> > >> >> current.getAbsolutePath() + ": " + e);
> > >> >>                     }
> > >> >>                 }
> > >> >>             }
> > >> >>         }
> > >> >>     }
> > >> >>
> > >> >> }
> > >> >>
> > >> >> So, I can contribute this in dev/src for example.
> > >> >>
> > >> >> 3. Instead of using dev/check-license, we can use the rat gradle
> > >> >> plugin (https://github.com/eskatos/creadur-rat-gradle/). I tested
> it
> > >> >> and it works fine as it uses a custom IReportable like:
> > >> >>
> > >> >>
> https://github.com/eskatos/creadur-rat-gradle/blob/master/src/main/kotlin/org/nosphere/apache/rat/RatWork.kt#L135
> > >> >>
> > >> >>
> https://github.com/eskatos/creadur-rat-gradle/blob/master/src/main/kotlin/org/nosphere/apache/rat/RatWork.kt#L189
> > >> >>
> > >> >> We can include this plugin the check gradle phase, meaning that we
> can
> > >> >> verify headers for each PR.
> > >> >>
> > >> >> My preference would be for 3, mainly because:
> > >> >> 1. it integrates smoothly in our gradle ecosystem, adding a new
> plugin
> > >> >> as we have gradle-baseline-java, gradle-errorprone-plugin,
> > >> >> spotless-plugin-gradle, etc
> > >> >> 2. As we can hook rat gradle plugin in the gradle check task, it
> means
> > >> >> license check will be perform at build time, including check on PR
> by
> > >> >> GitHub Actions
> > >> >>
> > >> >> If you are OK with 3, I will work on:
> > >> >> 1. a PR to use it
> > >> >> 2. a PR for website to update release check procedure
> > >> >>
> > >> >> Thoughts ?
> > >> >>
> > >> >> Regards
> > >> >> JB
> > >>
> > >> --
> > >> Xuanwo
> >
> > --
> > Xuanwo
>


-- 
Ryan Blue
Tabular

Reply via email to