On Thu, 8 May 2025 at 18:01, Daniel P. Berrangé <berra...@redhat.com> wrote:
>
> The new attempt at detecting missing SPDX-License-Identifier in
> new files is using the following logic
>
>  * When seeing a line starting 'diff --git ...' it indicates
>    the start of a file in the patch. This must trigger reporting
>    of violations in the previous file (if any).
>
>    It must reset the validation state, since this may now be a
>    pre-existing file being changed. This will be resolved by
>    the next rule.
>
>  * When seeing a line starting 'new file mode...' it indicates
>    a newly created file and must enable SPDX validation.
>
>  * When seeing EOF, it must trigger reporting of violations in
>    the last new file in the patch, if any.

> @@ -1442,6 +1457,8 @@ sub process {
>         my $in_imported_file = 0;
>         my $in_no_imported_file = 0;
>         my $non_utf8_charset = 0;
> +       my $expect_spdx = 0;
> +       my $expect_spdx_file;
>
>         our @report = ();
>         our $cnt_lines = 0;
> @@ -1679,9 +1696,38 @@ sub process {
>                         WARN("added, moved or deleted file(s), does 
> MAINTAINERS need updating?\n" . $herecurr);
>                 }
>
> +# All new files should have a SPDX-License-Identifier tag
> +               if ($line =~ /^diff --git/) {
> +                   # Start of file diff marker, report last file if it failed
> +                   # SPDX validation
> +                   if (defined $expect_spdx_file) {
> +                       &check_spdx_present($expect_spdx_file);
> +                   }
> +
> +                   # Reset state ready to find new file
> +                   $expect_spdx = 0;
> +                   $expect_spdx_file = undef;


We already have a point in the code where we say "ah, this looks
like the start of a new file" (under the comment "extract the
filename as it passes"), and it looks for two possible regexes,
not just "diff --git". Maybe we should combine these so that
we have something like

                if ($line =~ /^diff --git.*?(\S+)$/) {
                        handle_end_of_file($realfile) if $realfile ne '';
                        $realfile = $1;
                        $realfile =~ s@^([^/]*)/@@ if (!$file);
                        handle_start_of_file($realfile);
                } elsif ($line =~ /^\+\+\+\s+(\S+)/) {
                        handle_end_of_file($realfile) if $realfile ne '';
                        $realfile = $1;
                        $realfile =~ s@^([^/]*)/@@ if (!$file);
                        handle_start_of_file($realfile);

                        $p1_prefix = $1;
                        if (!$file && $tree && $p1_prefix ne '' &&
                            -e "$root/$p1_prefix") {
                                WARN("patch prefix '$p1_prefix'
exists, appears to be a -p0 patch\n");
                        }

                        next;

(side note: seems odd that we have 'next' here but not in the
previous half of this if()...)

                }

and move checkfilename() to inside handle_start_of_file(),
and have the spdx check handling done inside
handle_start_of_file() and handle_end_of_file() ?

> +       # End of diff, report last file block if it failed

and we would call
          handle_end_of_file($realfile) if $realfile ne '';

here too.

thanks
-- PMM

Reply via email to