iceberg-rust is using apache/skywalking-eyes/header@v0.5.0 now.

BTW, we found skywalking-eyes works really well. It's fast, correct and 
well-maintained.

Maybe worth take a look.

On Fri, Oct 27, 2023, at 17:48, Jean-Baptiste Onofré wrote:
> By the way, as dev/check-license is also used in iceberg-python and
> iceberg-go repositories (iceberg-rust doesn't have it), maybe I can
> move forward on new rat release with the fix on hidden directories and
> update there as well.
>
> Regards
> JB
>
> On Thu, Oct 26, 2023 at 5:19 PM Jean-Baptiste Onofré <j...@nanthrax.net> 
> wrote:
>>
>> Hi guys,
>>
>> During the 1.4.1 vote, we identified some files without ASF headers,
>> more specifically in hidden directories (like .baseline). These files
>> have not been detected by dev/check-license script.
>>
>> The reason is because the script uses apache-rat via java -jar (the
>> Apache RAT CLI), executing the RAT Report class ()
>>
>> When providing a directory to scan, the Report class uses a
>> DirectoryWalker to traverse the directories, looking for files to
>> check.
>> Unfortunately, by default, the DirectoryWalker ignores the hidden
>> directory (all directories starting with .):
>>
>> https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/walker/DirectoryWalker.java#L71
>>
>> https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/walker/Walker.java#L53
>>
>> In our case, it means that it ignores .baseline, .github, .git,
>> .palantir directories. This is not good as .baseline, .github and
>> .palantir directories are included in our source distribution, so the
>> ASF headers should be clean here.
>>
>> FYI, I will propose a change in rat to, at least, be able to use a
>> ReportConfiguration to define if we want to restrict directories or
>> not (be able to configure the Walkers basically). I will try to
>> include it in rat 0.17 (I discussed with Claude about that).
>> NB: rat gradle and maven plugins define their own Walkers/IReportable
>> to avoid this issue.
>>
>> I propose three options to improve this:
>> 1. We keep dev/check-license as it is today (using rat 0.15) and we
>> exclude .baseline, .github, .palantir directories from our source
>> distribution. It means that we will probably have issues while
>> building from source distribution. We can revisit dev/check-license
>> when we upgrade to rat 0.17
>>
>> 2. We keep dev/check-license but we create our own rat scanner with a
>> custom IReportable class considering all files/directories. Something
>> like:
>>
>>     public static void main(String[] args) throws Exception {
>>         ReportConfiguration reportConfiguration = new ReportConfiguration();
>>         
>> reportConfiguration.setHeaderMatcher(Defaults.createDefaultMatcher());
>>
>>         XmlWriter writer = new XmlWriter(new FileWriter("report.xml"));
>>         ClaimStatistic claimStatistic = new ClaimStatistic();
>>         RatReport ratReport = XmlReportFactory.createStandardReport(
>>                 writer,
>>                 claimStatistic,
>>                 reportConfiguration
>>         );
>>         ratReport.startReport();
>>         IncludeHiddenDirectoryWalker walker = new
>> IncludeHiddenDirectoryWalker(new File("/path/to/iceberg"));
>>         walker.run(ratReport);
>>         ratReport.endReport();
>>         writer.closeDocument();
>>     }
>>
>> public class IncludeHiddenDirectoryWalker implements IReportable {
>>
>>
>>     private File file;
>>
>>     public IncludeHiddenDirectoryWalker(File file) {
>>         this.file = file;
>>     }
>>
>>     @Override
>>     public void run(RatReport report) throws RatException {
>>         process(report, file);
>>     }
>>
>>     public void process(RatReport report, File file) {
>>         final File[] files = file.listFiles();
>>         if (files != null) {
>>             for (File current : files) {
>>                 if (current.isDirectory()) {
>>                     process(report, current);
>>                 } else {
>>                     try {
>>                         Document document = new FileDocument(current);
>>                         report.report(document);
>>                     } catch (RatException e) {
>>                         System.err.println("Can't report file " +
>> current.getAbsolutePath() + ": " + e);
>>                     }
>>                 }
>>             }
>>         }
>>     }
>>
>> }
>>
>> So, I can contribute this in dev/src for example.
>>
>> 3. Instead of using dev/check-license, we can use the rat gradle
>> plugin (https://github.com/eskatos/creadur-rat-gradle/). I tested it
>> and it works fine as it uses a custom IReportable like:
>>
>> https://github.com/eskatos/creadur-rat-gradle/blob/master/src/main/kotlin/org/nosphere/apache/rat/RatWork.kt#L135
>>
>> https://github.com/eskatos/creadur-rat-gradle/blob/master/src/main/kotlin/org/nosphere/apache/rat/RatWork.kt#L189
>>
>> We can include this plugin the check gradle phase, meaning that we can
>> verify headers for each PR.
>>
>> My preference would be for 3, mainly because:
>> 1. it integrates smoothly in our gradle ecosystem, adding a new plugin
>> as we have gradle-baseline-java, gradle-errorprone-plugin,
>> spotless-plugin-gradle, etc
>> 2. As we can hook rat gradle plugin in the gradle check task, it means
>> license check will be perform at build time, including check on PR by
>> GitHub Actions
>>
>> If you are OK with 3, I will work on:
>> 1. a PR to use it
>> 2. a PR for website to update release check procedure
>>
>> Thoughts ?
>>
>> Regards
>> JB

-- 
Xuanwo

Reply via email to