ascheman opened a new issue, #11612: URL: https://github.com/apache/maven/issues/11612
### New feature, improvement proposal ## Summary A proposal on [Build Sources Validation](https://cwiki.apache.org/confluence/x/84TMFw) (originally started by [mail on the dev list](https://lists.apache.org/thread/blsg4q490dk03qdmol2y5rrckw64lwqm)) was partly implemented as [Phase 1 (Module-aware resource handling)](https://cwiki.apache.org/confluence/x/84TMFw#BuildSourcesValidationDiscussionNotes-Phase1:Module-AwareResourceHandling(Implemented)) by PR #11505. This issue addresses [Phase 2](https://cwiki.apache.org/confluence/x/84TMFw#BuildSourcesValidationDiscussionNotes-Phase2:ComprehensiveRefactoring): extending the unified handling approach to all `<lang>` and `<scope>` combinations, not just resources. ## Additional Motivation As noted by @desruisseaux in [PR #11505 comments](https://github.com/apache/maven/pull/11505#discussion_r1899936913): > I thought that what this PR does for resources, the same thing could be done for tests. I'm thinking to refactor the `ResourceHandler` class in something that does not handle resources in a special way, but instead does the same thing in a uniform way for all combinations of `<lang>` and `<scope>`. ## Challenge: Multiple Source Directories Allowing multiple source directories (removing R3, see below) raises questions about data structures and handling: 1. **Multiple directories per (lang, scope, module)**: Users may legitimately configure multiple [`<source>`](https://maven.apache.org/ref/4-LATEST/api/maven-api-model/maven.html#class_source) directories for the same combination, e.g.: ```xml <sources> <!-- Implicit directory: src/com.example.app/main/java --> <source><lang>java</lang><scope>main</scope><module>com.example.app</module></source> <!-- Additional generated sources for the same module --> <source><lang>java</lang><scope>main</scope><module>com.example.app</module> <directory>target/generated-sources/com.example.app/java</directory></source> </sources> ``` 2. **Duplicate handling strategy**: The `<source>` element has additional attributes (`enabled`, `targetPath`, `stringFiltering`, `includes`, `excludes`, `targetVersion`). When the same `(lang, scope, module, directory)` tuple appears multiple times, we need a defined handling strategy: ```xml <source><lang>java</lang><scope>main</scope><module>com.example</module> <directory>src/com.example/main/java</directory> <enabled>true</enabled></source> <source><lang>java</lang><scope>main</scope><module>com.example</module> <directory>src/com.example/main/java</directory> <enabled>false</enabled></source> <!-- How to handle? --> ``` **Proposed strategy: `enabled` as key discriminator** - Use `(lang, scope, module, directory)` as identity - First entry with `enabled=true` wins and defines the effective configuration - Subsequent entries with same identity AND `enabled=true` trigger a **WARNING** (alternative: ERROR, since user could simply remove the duplicate) - Entries with `enabled=false` are harmless – they explicitly disable that source - Duplicate detection applies **only to enabled sources** – disabled sources are effectively no-ops for tracking **Note on current implementation:** The current implementation already filters by `enabled` when retrieving sources via [`MavenProject.getEnabledSourceRoots()`](https://github.com/apache/maven/blob/master/impl/maven-core/src/main/java/org/apache/maven/project/MavenProject.java). However, during source processing in `DefaultProjectBuilder`, all sources are added regardless of `enabled` status, and the tracking flags (`hasMain`, `hasTest`, etc.) don't currently consider `enabled`. Phase 2 should align the processing logic with the retrieval logic by only tracking enabled sources. 3. **Target-oriented view**: Regardless of input structure, all sources eventually compile to target directories: - `target/classes/<module>/` (main scope) - `target/test-classes/<module>/` (test scope) The compiler/processor handles multiple input directories naturally – they all feed into the same output location. The appropriate data structure and algorithm to support this should be discussed further before implementation. ## Proposed Changes ### 1. Generalize `ResourceHandlingContext` to `SourceHandlingContext` Replace the current [`ResourceHandlingContext`](https://github.com/apache/maven/blob/master/impl/maven-core/src/main/java/org/apache/maven/project/ResourceHandlingContext.java) approach with a flexible, unified source handling mechanism. **Current (Phase 1) - extended boolean flags** (Pseudo-code): ```java // Hardcoded tracking for specific combinations boolean hasMain = false; boolean hasTest = false; boolean hasMainResources = false; boolean hasTestResources = false; for (var source : sources) { if (lang == JAVA && scope == MAIN) hasMain = true; if (lang == JAVA && scope == TEST) hasTest = true; if (lang == RESOURCES && scope == MAIN) hasMainResources = true; if (lang == RESOURCES && scope == TEST) hasTestResources = true; } // Resource-specific handling context.handleResourceConfiguration(MAIN, hasMainResources); context.handleResourceConfiguration(TEST, hasTestResources); ``` **Proposed (Phase 2) – Flexible set-based tracking:** ```java // Generic tracking for ANY lang/scope/module/directory combination record SourceKey(Language lang, ProjectScope scope, String module, Path directory) {} Set<SourceKey> declaredSources = new HashSet<>(); // We need to handle some special cases, e.g., implicit directory etc. for (var source : sources) { SourceKey key = new SourceKey(lang, scope, module, directory); if (source.isEnabled()) { if (declaredSources.contains(key)) { // Duplicate enabled source - emit WARNING (or ERROR) warn("Duplicate enabled source for " + key); } else { declaredSources.add(key); } } // disabled sources are not tracked - they are explicitly disabled by the user // and won't be returned by getEnabledSourceRoots() anyway } // Unified handling for ALL source types (java, resources, kotlin, etc.) for (SourceKey key : declaredSources) { // Same logic for java, resources, kotlin, groovy, etc. handleSourceConfiguration(key.lang(), key.scope(), key.module(), key.directory()); } ``` **Benefits:** - Replace hardcoded boolean flags with a flexible set structure - Generalize handling logic (currently resource-only) to work for ALL lang/scope combinations - Use ONE unified approach instead of special-casing resources - Easily extensible for new languages (kotlin, groovy, scala, etc.) ### 2. Remove Validation Rule R3 (Duplicate Sources) Per @desruisseaux suggestion in [his mailing list reply](https://lists.apache.org/thread/r1vqttptg62tqlt96vtrm605xyxjj095): > Remove rule R3, i.e., accept duplicated source for `scope='X'`, `lang='Y'`, `module='Z'`. The compiler plugin 4 already accepts that, the need for multi-source directories in Maven 3 was strong enough to justify an external plugin, and Maven 3 already accepts that for resources if I remember right. **Rationale:** - Maven Compiler Plugin 4 already supports multiple source directories - The `build-helper-maven-plugin` exists specifically because users needed multi-source support - Maven 3 already allows duplicate resource directories ### 3. Unified Source Directory Handling Apply the same modular source detection logic from [Phase 1](https://cwiki.apache.org/confluence/x/84TMFw#BuildSourcesValidationDiscussionNotes-Phase1:Module-AwareResourceHandling(Implemented)) to: - Java sources (`src/<module>/main/java`, `src/<module>/test/java`) - Test sources - Other language sources (kotlin, groovy, scala, etc.) ## Implementation Notes ### Files to Modify - `impl/maven-core/src/main/java/org/apache/maven/project/ResourceHandlingContext.java` - Rename/refactor to `SourceHandlingContext` - Generalize `handleResourceConfiguration()` to handle all source types - Add `shouldAddSource()` for duplicate detection - Add `hasSources(Language, ProjectScope)` to replace boolean flags - `impl/maven-core/src/main/java/org/apache/maven/project/DefaultProjectBuilder.java` - Replace hardcoded boolean flags (`hasMain`, `hasTest`, etc.) with `SourceHandlingContext.hasSources()` - Use `SourceHandlingContext.shouldAddSource()` for duplicate detection ### No modification necessary (due to dropped R3) - `impl/maven-impl/src/main/java/org/apache/maven/impl/model/DefaultModelValidator.java` - R3 validation (duplicate sources detection) was originally proposed but **never implemented** in the validator - Duplicate detection is handled in `SourceHandlingContext.shouldAddSource()` instead - R1, R2, R4 validations remain unchanged ### Validation Rules After Phase 2 | Rule | Description | Addresses | Status | |--------|---------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|------------| | R1 | `<sources>` conflicts with explicit `<sourceDirectory>` | - | Keep | | R2 | Mixed modular/non-modular sources | [Problem 1](https://cwiki.apache.org/confluence/x/84TMFw#BuildSourcesValidationDiscussionNotes-Problem1:MixedModular/Non-ModularSourcesAcceptedSilently) | Keep | | ~~R3~~ | ~~Duplicate sources warning~~ | - | **Remove** | | R4 | Filesystem warning for classic directory | [Problem 4](https://cwiki.apache.org/confluence/x/84TMFw#BuildSourcesValidationDiscussionNotes-Problem4:LegacyConfigurationSilentlyIgnored) | Keep | **Note:** [Problem 2 (Test-Only Modular Projects)](https://cwiki.apache.org/confluence/x/84TMFw#BuildSourcesValidationDiscussionNotes-Problem2:Test-OnlyModularProjectsGetClassicMainSourcesInjected) and [Problem 3 (Resources Are Not Module-Aware)](https://cwiki.apache.org/confluence/x/84TMFw#BuildSourcesValidationDiscussionNotes-Problem3:ResourcesAreNotModule-Aware) were addressed by source injection logic in Phase 1. ## Acceptance Criteria | # | Criterion | Description | |-----|----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | AC1 | Boolean flags eliminated | The hardcoded `hasMain`, `hasTest`, `hasMainResources`, `hasTestResources`, `hasScript` flags are replaced with flexible set-based tracking via `SourceHandlingContext.hasSources()` | | AC2 | Unified source tracking | All lang/scope combinations use the same tracking mechanism (not just resources) | | AC3 | Duplicate detection | Duplicate enabled sources with same `(lang, scope, module, directory)` identity trigger a WARNING | | AC4 | First enabled wins | Only the first enabled source for an identity is added to the project; duplicates are skipped | | AC5 | Disabled sources unchanged | Disabled sources are still added but filtered by `getEnabledSourceRoots()` | -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
