ascheman opened a new issue, #11612:
URL: https://github.com/apache/maven/issues/11612

   ### New feature, improvement proposal
   
   ## Summary
   
   A proposal on [Build Sources 
Validation](https://cwiki.apache.org/confluence/x/84TMFw) (originally started 
by [mail on the dev 
list](https://lists.apache.org/thread/blsg4q490dk03qdmol2y5rrckw64lwqm)) was 
partly implemented as [Phase 1 (Module-aware resource 
handling)](https://cwiki.apache.org/confluence/x/84TMFw#BuildSourcesValidationDiscussionNotes-Phase1:Module-AwareResourceHandling(Implemented))
 by PR #11505.
   This issue addresses [Phase 
2](https://cwiki.apache.org/confluence/x/84TMFw#BuildSourcesValidationDiscussionNotes-Phase2:ComprehensiveRefactoring):
 extending the unified handling approach to all `<lang>` and `<scope>` 
combinations, not just resources.
   
   ## Additional Motivation
   
   As noted by @desruisseaux in [PR #11505 
comments](https://github.com/apache/maven/pull/11505#discussion_r1899936913):
   
   > I thought that what this PR does for resources, the same thing could be 
done for tests. I'm thinking to refactor the `ResourceHandler` class in 
something that does not handle resources in a special way, but instead does the 
same thing in a uniform way for all combinations of `<lang>` and `<scope>`.
   
   ## Challenge: Multiple Source Directories
   
   Allowing multiple source directories (removing R3, see below) raises 
questions about data structures and handling:
   
   1. **Multiple directories per (lang, scope, module)**: Users may 
legitimately configure multiple 
[`<source>`](https://maven.apache.org/ref/4-LATEST/api/maven-api-model/maven.html#class_source)
 directories for the same combination, e.g.:
      ```xml
      <sources>
        <!-- Implicit directory: src/com.example.app/main/java -->
        
<source><lang>java</lang><scope>main</scope><module>com.example.app</module></source>
        <!-- Additional generated sources for the same module -->
        
<source><lang>java</lang><scope>main</scope><module>com.example.app</module>
          
<directory>target/generated-sources/com.example.app/java</directory></source>
      </sources>
      ```
   
   2. **Duplicate handling strategy**: The `<source>` element has additional 
attributes (`enabled`, `targetPath`, `stringFiltering`, `includes`, `excludes`, 
`targetVersion`). When the same `(lang, scope, module, directory)` tuple 
appears multiple times, we need a defined handling strategy:
      ```xml
      <source><lang>java</lang><scope>main</scope><module>com.example</module>
        <directory>src/com.example/main/java</directory>
        <enabled>true</enabled></source>
      <source><lang>java</lang><scope>main</scope><module>com.example</module>
        <directory>src/com.example/main/java</directory>
        <enabled>false</enabled></source>  <!-- How to handle? -->
      ```
   
      **Proposed strategy: `enabled` as key discriminator**
      - Use `(lang, scope, module, directory)` as identity
      - First entry with `enabled=true` wins and defines the effective 
configuration
      - Subsequent entries with same identity AND `enabled=true` trigger a 
**WARNING** (alternative: ERROR, since user could simply remove the duplicate)
      - Entries with `enabled=false` are harmless – they explicitly disable 
that source
      - Duplicate detection applies **only to enabled sources** – disabled 
sources are effectively no-ops for tracking
   
      **Note on current implementation:** The current implementation already 
filters by `enabled` when retrieving sources via 
[`MavenProject.getEnabledSourceRoots()`](https://github.com/apache/maven/blob/master/impl/maven-core/src/main/java/org/apache/maven/project/MavenProject.java).
 However, during source processing in `DefaultProjectBuilder`, all sources are 
added regardless of `enabled` status, and the tracking flags (`hasMain`, 
`hasTest`, etc.) don't currently consider `enabled`. Phase 2 should align the 
processing logic with the retrieval logic by only tracking enabled sources.
   
   3. **Target-oriented view**: Regardless of input structure, all sources 
eventually compile to target directories:
      - `target/classes/<module>/` (main scope)
      - `target/test-classes/<module>/` (test scope)
   
      The compiler/processor handles multiple input directories naturally – 
they all feed into the same output location.
   
   The appropriate data structure and algorithm to support this should be 
discussed further before implementation.
   
   ## Proposed Changes
   
   ### 1. Generalize `ResourceHandlingContext` to `SourceHandlingContext`
   
   Replace the current 
[`ResourceHandlingContext`](https://github.com/apache/maven/blob/master/impl/maven-core/src/main/java/org/apache/maven/project/ResourceHandlingContext.java)
 approach with a flexible, unified source handling mechanism.
   
   **Current (Phase 1) - extended boolean flags** (Pseudo-code):
   ```java
   // Hardcoded tracking for specific combinations
   boolean hasMain = false;
   boolean hasTest = false;
   boolean hasMainResources = false;
   boolean hasTestResources = false;
   
   for (var source : sources) {
       if (lang == JAVA && scope == MAIN) hasMain = true;
       if (lang == JAVA && scope == TEST) hasTest = true;
       if (lang == RESOURCES && scope == MAIN) hasMainResources = true;
       if (lang == RESOURCES && scope == TEST) hasTestResources = true;
   }
   
   // Resource-specific handling
   context.handleResourceConfiguration(MAIN, hasMainResources);
   context.handleResourceConfiguration(TEST, hasTestResources);
   ```
   
   **Proposed (Phase 2) – Flexible set-based tracking:**
   ```java
   // Generic tracking for ANY lang/scope/module/directory combination
   record SourceKey(Language lang, ProjectScope scope, String module, Path 
directory) {}
   Set<SourceKey> declaredSources = new HashSet<>();
   
   // We need to handle some special cases, e.g., implicit directory etc.
   for (var source : sources) {
       SourceKey key = new SourceKey(lang, scope, module, directory);
       if (source.isEnabled()) {
           if (declaredSources.contains(key)) {
               // Duplicate enabled source - emit WARNING (or ERROR)
               warn("Duplicate enabled source for " + key);
           } else {
               declaredSources.add(key);
           }
       }
       // disabled sources are not tracked - they are explicitly disabled by 
the user
       // and won't be returned by getEnabledSourceRoots() anyway
   }
   
   // Unified handling for ALL source types (java, resources, kotlin, etc.)
   for (SourceKey key : declaredSources) {
       // Same logic for java, resources, kotlin, groovy, etc.
       handleSourceConfiguration(key.lang(), key.scope(), key.module(), 
key.directory());
   }
   ```
   
   **Benefits:**
   - Replace hardcoded boolean flags with a flexible set structure
   - Generalize handling logic (currently resource-only) to work for ALL 
lang/scope combinations
   - Use ONE unified approach instead of special-casing resources
   - Easily extensible for new languages (kotlin, groovy, scala, etc.)
   
   ### 2. Remove Validation Rule R3 (Duplicate Sources)
   
   Per @desruisseaux suggestion in [his mailing list 
reply](https://lists.apache.org/thread/r1vqttptg62tqlt96vtrm605xyxjj095):
   
   > Remove rule R3, i.e., accept duplicated source for `scope='X'`, 
`lang='Y'`, `module='Z'`. The compiler plugin 4 already accepts that, the need 
for multi-source directories in Maven 3 was strong enough to justify an 
external plugin, and Maven 3 already accepts that for resources if I remember 
right.
   
   **Rationale:**
   - Maven Compiler Plugin 4 already supports multiple source directories
   - The `build-helper-maven-plugin` exists specifically because users needed 
multi-source support
   - Maven 3 already allows duplicate resource directories
   
   ### 3. Unified Source Directory Handling
   
   Apply the same modular source detection logic from [Phase 
1](https://cwiki.apache.org/confluence/x/84TMFw#BuildSourcesValidationDiscussionNotes-Phase1:Module-AwareResourceHandling(Implemented))
 to:
   - Java sources (`src/<module>/main/java`, `src/<module>/test/java`)
   - Test sources
   - Other language sources (kotlin, groovy, scala, etc.)
   
   ## Implementation Notes
   
   ### Files to Modify
   
   - 
`impl/maven-core/src/main/java/org/apache/maven/project/ResourceHandlingContext.java`
     - Rename/refactor to `SourceHandlingContext`
     - Generalize `handleResourceConfiguration()` to handle all source types
     - Add `shouldAddSource()` for duplicate detection
     - Add `hasSources(Language, ProjectScope)` to replace boolean flags
   
   - 
`impl/maven-core/src/main/java/org/apache/maven/project/DefaultProjectBuilder.java`
     - Replace hardcoded boolean flags (`hasMain`, `hasTest`, etc.) with 
`SourceHandlingContext.hasSources()`
     - Use `SourceHandlingContext.shouldAddSource()` for duplicate detection
   
   ### No modification necessary (due to dropped R3)
   
   - 
`impl/maven-impl/src/main/java/org/apache/maven/impl/model/DefaultModelValidator.java`
     - R3 validation (duplicate sources detection) was originally proposed but 
**never implemented** in the validator
     - Duplicate detection is handled in 
`SourceHandlingContext.shouldAddSource()` instead
     - R1, R2, R4 validations remain unchanged
   
   ### Validation Rules After Phase 2
   
   | Rule   | Description                                             | 
Addresses                                                                       
                                                                         | 
Status     |
   
|--------|---------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|------------|
   | R1     | `<sources>` conflicts with explicit `<sourceDirectory>` | -       
                                                                                
                                                                 | Keep       |
   | R2     | Mixed modular/non-modular sources                       | 
[Problem 
1](https://cwiki.apache.org/confluence/x/84TMFw#BuildSourcesValidationDiscussionNotes-Problem1:MixedModular/Non-ModularSourcesAcceptedSilently)
 | Keep       |
   | ~~R3~~ | ~~Duplicate sources warning~~                           | -       
                                                                                
                                                                 | **Remove** |
   | R4     | Filesystem warning for classic directory                | 
[Problem 
4](https://cwiki.apache.org/confluence/x/84TMFw#BuildSourcesValidationDiscussionNotes-Problem4:LegacyConfigurationSilentlyIgnored)
              | Keep       |
   
   **Note:** [Problem 2 (Test-Only Modular 
Projects)](https://cwiki.apache.org/confluence/x/84TMFw#BuildSourcesValidationDiscussionNotes-Problem2:Test-OnlyModularProjectsGetClassicMainSourcesInjected)
 and [Problem 3 (Resources Are Not 
Module-Aware)](https://cwiki.apache.org/confluence/x/84TMFw#BuildSourcesValidationDiscussionNotes-Problem3:ResourcesAreNotModule-Aware)
 were addressed by source injection logic in Phase 1.
   
   ## Acceptance Criteria
   
   | #   | Criterion                  | Description                             
                                                                                
                                                             |
   
|-----|----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
   | AC1 | Boolean flags eliminated   | The hardcoded `hasMain`, `hasTest`, 
`hasMainResources`, `hasTestResources`, `hasScript` flags are replaced with 
flexible set-based tracking via `SourceHandlingContext.hasSources()` |
   | AC2 | Unified source tracking    | All lang/scope combinations use the 
same tracking mechanism (not just resources)                                    
                                                                 |
   | AC3 | Duplicate detection        | Duplicate enabled sources with same 
`(lang, scope, module, directory)` identity trigger a WARNING                   
                                                                 |
   | AC4 | First enabled wins         | Only the first enabled source for an 
identity is added to the project; duplicates are skipped                        
                                                                |
   | AC5 | Disabled sources unchanged | Disabled sources are still added but 
filtered by `getEnabledSourceRoots()`                                           
                                                                |
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to