mattcasters opened a new pull request, #7383:
URL: https://github.com/apache/hop/pull/7383

   # Walkthrough: Moving Average (Last N Events) Aggregation
   
   This PR implements the **Moving Average (Last N Events)** aggregation type for the Group By transform, verifies it with unit tests and a pipeline integration test, and documents the feature.
   
   Issue: #7023
   
   ## Changes Made
   
   ### 1. Type Configuration & Model
   - **[Aggregation.java](file:///home/matt/git/mattcasters/hop/plugins/transforms/groupby/src/main/java/org/apache/hop/pipeline/transforms/groupby/Aggregation.java)**:
     - Added the integer constant `TYPE_GROUP_MOVING_AVERAGE = 23`.
     - Added short label `"MOVING_AVG"` to `typeGroupLabel` and long description key `MOVING_AVERAGE` to `typeGroupLongDesc`.
     - Added an `orderField` property with the `@HopMetadataProperty` annotation to allow specifying the sort/order field. This is persisted to XML/JSON metadata automatically.
     - Updated `clone()`, `equals()`, and `hashCode()` to support the new field.
   
   ### 2. Runtime Implementation
   - **[GroupByData.java](file:///home/matt/git/mattcasters/hop/plugins/transforms/groupby/src/main/java/org/apache/hop/pipeline/transforms/groupby/GroupByData.java)**:
     - Added an array of sliding windows, `movingAvgWindows` (`ArrayDeque<Double>[]`), to hold the values inside the rolling window.
     - Added list tracking fields (`movingAvgSourceIndexes`, `movingAvgTargetIndexes`, `movingAvgWidths`, `movingAvgIndexes`) for on-the-fly calculations.
   - **[GroupBy.java](file:///home/matt/git/mattcasters/hop/plugins/transforms/groupby/src/main/java/org/apache/hop/pipeline/transforms/groupby/GroupBy.java)**:
     - **processRow**: Initializes the sliding windows array and tracking lists if the `MOVING_AVG` type is configured.
     - **newAggregate**: Resets and clears the sliding window array for the active aggregation index on group changes.
     - **addMovingAverages**: Folds new values into the `ArrayDeque`, trims the deque to window size $N$, and computes the average of its elements. Emits `null` while the window holds fewer than $N$ values (partial window handling).
     - Added `addMovingAverages` calls to the buffer replay loops so that rolling averages are calculated correctly row by row.
   - **[GroupByMeta.java](file:///home/matt/git/mattcasters/hop/plugins/transforms/groupby/src/main/java/org/apache/hop/pipeline/transforms/groupby/GroupByMeta.java)**:
     - Declared `MOVING_AVG` as outputting `IValueMeta.TYPE_NUMBER`.
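   
   The sliding-window behaviour described for `addMovingAverages` and `newAggregate` can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this PR: the class `MovingAverageSketch` and its methods are hypothetical names, and the handling of `null` inputs (skipped without occupying a window slot) is an assumption based on the "null skipping" test case.
   
   ```java
   import java.util.ArrayDeque;
   import java.util.Deque;
   
   /** Hypothetical stand-in for one moving-average window; not the actual Hop classes. */
   class MovingAverageSketch {
     private final int width; // window size N
     private final Deque<Double> window = new ArrayDeque<>();
   
     MovingAverageSketch(int width) {
       if (width < 1) {
         throw new IllegalArgumentException("width must be >= 1");
       }
       this.width = width;
     }
   
     /** On a group change: forget all values collected so far. */
     void reset() {
       window.clear();
     }
   
     /**
      * Folds one value into the window and returns the average of the last N
      * values, or null while fewer than N values are held (partial window).
      * Assumption: null inputs are skipped and do not take up a window slot.
      */
     Double add(Double value) {
       if (value != null) {
         window.addLast(value);
         if (window.size() > width) {
           window.removeFirst(); // trim the deque back to N elements
         }
       }
       if (window.size() < width) {
         return null;
       }
       double sum = 0;
       for (double v : window) {
         sum += v;
       }
       return sum / width;
     }
   
     public static void main(String[] args) {
       MovingAverageSketch avg = new MovingAverageSketch(3);
       for (double v : new double[] {10, 20, 30, 40}) {
         System.out.println(avg.add(v)); // null, null, 20.0, 30.0
       }
       avg.reset(); // simulate a group change
       System.out.println(avg.add(5.0)); // null: the window is partial again
     }
   }
   ```
   
   Summing the deque on every row keeps the sketch simple; a running sum would avoid the inner loop for large $N$, at the cost of accumulating floating-point drift.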
   
   ### 3. UI Dialog
   - **[GroupByDialog.java](file:///home/matt/git/mattcasters/hop/plugins/transforms/groupby/src/main/java/org/apache/hop/pipeline/transforms/groupby/GroupByDialog.java)**:
     - Added a 5th column: **Order field** (populated via dropdown from previous step fields).
     - Configured `getData()` to load, `ok()` to retrieve/save, and `setComboBoxes()` to suggest field values for the new column.
     - Forces "Include all rows" checkbox selection when `MOVING_AVG` is selected.
   
   ### 4. Internationalization
   - **[messages_en_US.properties](file:///home/matt/git/mattcasters/hop/plugins/transforms/groupby/src/main/resources/org/apache/hop/pipeline/transforms/groupby/messages/messages_en_US.properties)**:
     - Added description: `Moving average (last N rows)`.
     - Added Order Field column name and tooltip descriptions.
   
   ### 5. Documentation
   - **[groupby.adoc](file:///home/matt/git/mattcasters/hop/docs/hop-user-manual/modules/ROOT/pages/pipeline/transforms/groupby.adoc)**:
     - Added `Moving average (last N rows)` to the lists of available aggregate methods.
     - Described specifying window size in the `Value` column and pre-sorting fields in the `Order field` column.
   
   ---
   
   ## Verification & Testing
   
   ### 1. Automated Unit Tests
   We created a new JUnit 5 test class:
   - **[MovingAverageAggregationTest.java](file:///home/matt/git/mattcasters/hop/plugins/transforms/groupby/src/test/java/org/apache/hop/pipeline/transforms/groupby/MovingAverageAggregationTest.java)**:
     - Tests partial window nulls, sliding window updates, null skipping, resets on group change, window size 1, and result pass-through.
   
   We updated:
   - **[GroupByMetaTest.java](file:///home/matt/git/mattcasters/hop/plugins/transforms/GroupByMetaTest.java)**:
     - Verifies round-trip XML configuration serialization/deserialization.
   
   ### 2. Integration Pipeline Unit Test
   We created a self-contained pipeline test case inside the integration tests project:
   - **[0006-groupby-moving-average.hpl](file:///home/matt/git/mattcasters/hop/integration-tests/transforms/0006-groupby-moving-average.hpl)**: Generates a sorted dataset of values for two groups and runs the Group By transform with an $N=3$ moving average, followed by a `Validate` Dummy step.
   - **[golden-groupby-moving-average.csv](file:///home/matt/git/mattcasters/hop/integration-tests/transforms/datasets/golden-groupby-moving-average.csv)** & **[golden-groupby-moving-average.json](file:///home/matt/git/mattcasters/hop/integration-tests/transforms/metadata/dataset/golden-groupby-moving-average.json)**: Golden dataset containing expected rolling average outputs (retains empty fields for partial windows).
   - **[0006-groupby-moving-average UNIT.json](file:///home/matt/git/mattcasters/hop/integration-tests/transforms/metadata/unit-test/0006-groupby-moving-average%20UNIT.json)**: Pipeline unit test mapping `Validate` transform results to the golden data set.
   - **[main-0006-groupby.hwf](file:///home/matt/git/mattcasters/hop/integration-tests/transforms/main-0006-groupby.hwf)**: Action `Run Group By tests` now executes our moving average unit test as well.
   
   ---
   
   ## Test Executions
   
   1. **Transform Unit Tests**:
      ```bash
      ./mvnw clean test -pl plugins/transforms/groupby -Dtest="GroupByMetaTest,MovingAverageAggregationTest"
      ```
      **Result**: Build Success. All 8 tests passed.
   
   2. **Integration Test Workflow**:
      ```bash
      sh hop run -e "IT transforms" -f main-0006-groupby.hwf -r local
      ```
      **Result**: Workflow execution finished successfully. `Validate - golden-groupby-moving-average : Test passed successfully against golden data set`.
   


