iemejia opened a new pull request, #3580:
URL: https://github.com/apache/parquet-java/pull/3580

   ## Summary
   
   Remove the codec matrix dimension from the CI Hadoop 3 workflow. The codec 
split only parameterized a single integration test (`FileEncodingsIT`) while 
duplicating the entire build+test suite across 4 jobs instead of 2.
   
   ## Motivation
   
   Analysis of recent CI runs shows:
   - The `verify` step takes ~12 min (JDK 17) / ~16 min (JDK 11) **regardless 
of which codecs are set**
   - The difference between codec variants for the same JDK is <1 minute
   - Each redundant job wastes ~6 min of `before_install` + `install` overhead
   - Total savings: **~46% fewer billable CI minutes** (from ~81 to ~44 min per 
run)
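
As a rough check of the savings figure, the sketch below recomputes the reduction from the two approximate totals quoted above (~81 and ~44 billable minutes per run). The constant and function names are illustrative only and do not come from the workflow.

```python
# Approximate billable minutes per CI run, as quoted in the Motivation list.
BEFORE_MIN = 81  # 4 jobs: 2 JDKs x 2 codec groups
AFTER_MIN = 44   # 2 jobs: 2 JDKs, all codecs in each job


def reduction_pct(before: float, after: float) -> float:
    """Return the percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100


saved = BEFORE_MIN - AFTER_MIN
print(f"saved ~{saved} min per run, ~{reduction_pct(BEFORE_MIN, AFTER_MIN):.0f}% fewer billable minutes")
```

Because both inputs are estimates, the result is only as precise as they are.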
   
   ## Changes
   
   - Remove the `codes` matrix dimension (was: `['uncompressed,brotli', 
'gzip,snappy']`)
   - Set `TEST_CODECS` to all codecs in a single string: 
`uncompressed,brotli,gzip,snappy,zstd`
   - Add `zstd` to the tested codecs (previously missing)
   - Jobs go from 4 (2 JDKs x 2 codec groups) to 2 (2 JDKs)
   
   ## Impact
   
   - Wall-clock CI time: unchanged (jobs already ran in parallel)
   - Billable minutes: ~46% reduction
   - Test coverage: improved (adds zstd)
   - `FileEncodingsIT` now tests 8 types x 5 codecs = 40 parameterized cases in 
one shot
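
The case count can be sketched as a cross product of types and codecs. This is a minimal illustration, not the actual `FileEncodingsIT` code: it assumes `TEST_CODECS` is parsed as a comma-separated list and that the 8 types are the Parquet primitive types. The helper name `parse_codecs` is hypothetical.

```python
from itertools import product

# Value the PR sets for TEST_CODECS (all codecs in a single string).
TEST_CODECS = "uncompressed,brotli,gzip,snappy,zstd"

# Assumption: the 8 types are the Parquet primitive types.
PRIMITIVE_TYPES = [
    "BOOLEAN", "INT32", "INT64", "INT96",
    "FLOAT", "DOUBLE", "BINARY", "FIXED_LEN_BYTE_ARRAY",
]


def parse_codecs(value: str) -> list[str]:
    """Split a comma-separated codec string into upper-case codec names."""
    return [c.strip().upper() for c in value.split(",") if c.strip()]


# One parameterized case per (type, codec) pair.
cases = list(product(PRIMITIVE_TYPES, parse_codecs(TEST_CODECS)))
print(len(cases))
```

With the previous codec groups (`uncompressed,brotli` and `gzip,snappy`), the same product yields 16 cases per job.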


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

