xinxinzhenhuai opened a new pull request, #78:
URL: https://github.com/apache/datasketches-postgresql/pull/78

   #### Context
   This PR is a follow-up to https://github.com/apache/datasketches-postgresql/issues/77.
   The test `aod_sketch_test.sql` fails at line 25 with a segmentation fault:
   ```
    psql:test/aod_sketch_test.sql:25: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
    psql:test/aod_sketch_test.sql:25: error: connection to server was lost
    2025-05-23 21:14:29.609 UTC [13567] LOG:  server process (PID 13607) was terminated by signal 11: Segmentation fault
    2025-05-23 21:14:29.609 UTC [13567] DETAIL:  Failed process was running: select aod_sketch_get_estimate(aod_sketch_union(sketch, 16)) from aod_sketch_test;
   ```
   
   The issue stems from a misuse of the `aod_sketch_union` function in the following line:
   ```
   select aod_sketch_get_estimate(aod_sketch_union(sketch, 16)) from 
aod_sketch_test;
   ```
   The intent was to set `lgk` to 16, but the second argument of `aod_sketch_union` is `num_values`, not `lgk`. The sketches in the test table `aod_sketch_test` were built with `num_values = 1` (`array[1]`), as shown below:
   ```
   -- lgk = 16
   insert into aod_sketch_test
     select aod_sketch_build(key, aod, 16)
     from (values (4, array[1]), (5, array[1]), (6, array[1]), (7, array[1]), 
(8, array[1])) as t(key, aod);
   ```
   Passing 16 as `num_values` causes an out-of-bounds access during the union operation, which is undefined behavior. In our tests on OEL7, this corrupted the array and triggered a segmentation fault. The out-of-bounds access happens in the following loop:
   ```
   void operator()(Array& array, const Array& other) const {
     for (uint8_t i = 0; i < num_values_; ++i) {
       array[i] += other[i];
     }
   }
   ```
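To make the failure mode concrete, here is a minimal standalone sketch, not the library's code. `checked_sum_policy`, `update_overruns`, and `demo` are hypothetical names, and `std::vector<double>` is assumed as the array type. The loop is the same as above, but `.at()` replaces `operator[]`, so a mismatched `num_values` raises `std::out_of_range` instead of silently reading and writing past the end of a one-element array.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the update policy quoted above.
// Assumption: the array type behaves like std::vector<double>.
class checked_sum_policy {
public:
  explicit checked_sum_policy(uint8_t num_values) : num_values_(num_values) {}

  // Same loop as the original, but .at() is bounds-checked and throws
  // std::out_of_range where operator[] would be undefined behavior.
  void operator()(std::vector<double>& array, const std::vector<double>& other) const {
    for (uint8_t i = 0; i < num_values_; ++i) {
      array.at(i) += other.at(i);
    }
  }

private:
  uint8_t num_values_;
};

// Returns true if applying the policy would step outside either array.
// The first array is taken by value so the caller's data is left untouched.
bool update_overruns(uint8_t num_values, std::vector<double> array,
                     const std::vector<double>& other) {
  try {
    checked_sum_policy{num_values}(array, other);
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}

// Illustrates the two cases from the test file.
void demo() {
  assert(!update_overruns(1, {1.0}, {2.0}));  // num_values matches the arrays
  assert(update_overruns(16, {1.0}, {2.0}));  // 16 requested, arrays hold one value
}
```

With unchecked indexing the second case does not fail cleanly. It touches memory beyond the array, which is consistent with the corruption and crash described above.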
   #### Changes
   1. Updated `aod_sketch_test.sql` to pass the correct `lgk`
   2. Added `num_values` checks in `pg_aod_sketch_union_agg`
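
The check in item 2 can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `validate_num_values`, `try_validate`, and `self_check` are hypothetical names, and a C++ exception stands in for however the extension actually reports errors to PostgreSQL. The message text is copied from the error output in the testing section.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: reject a union whose num_values differs from the
// num_values of the sketch being merged in, before any array is touched.
void validate_num_values(uint8_t union_num_values, uint8_t sketch_num_values) {
  if (union_num_values != sketch_num_values) {
    throw std::invalid_argument(
        "pg_aod_sketch_union_agg expects the same num_values in union and sketch");
  }
}

// Convenience wrapper: returns an empty string on success, or the error message.
std::string try_validate(uint8_t union_num_values, uint8_t sketch_num_values) {
  try {
    validate_num_values(union_num_values, sketch_num_values);
    return "";
  } catch (const std::invalid_argument& e) {
    return e.what();
  }
}

// Illustrates the two cases exercised by the test file.
void self_check() {
  assert(try_validate(1, 1).empty());    // matching num_values: accepted
  assert(!try_validate(16, 1).empty());  // 16 vs. 1: rejected with an error
}
```

Validating up front turns the undefined behavior into a deterministic error that the caller can see and fix.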
   
   #### Testing
   ```
    aod_sketch_get_estimate 
   -------------------------
                          8
   (1 row)
   
    aod_sketch_get_estimate 
   -------------------------
                          8
   (1 row)
   
    aod_sketch_get_estimate 
   -------------------------
                          2
   (1 row)
   
    aod_sketch_get_estimate 
   -------------------------
                          1
   (1 row)
   ```
   If we pass a mismatched `num_values` (16 instead of 1), the query now fails with an error instead of crashing:
   ```
    aod_sketch_get_estimate 
   -------------------------
                          8
   (1 row)
   
   2025-06-06 00:25:14.352 UTC [19080] ERROR:  pg_aod_sketch_union_agg expects the same num_values in union and sketch
   2025-06-06 00:25:14.352 UTC [19080] STATEMENT:  select aod_sketch_get_estimate(aod_sketch_union(sketch, 16)) from aod_sketch_test;
   psql:test/aod_sketch_test.sql:25: ERROR:  pg_aod_sketch_union_agg expects the same num_values in union and sketch
    aod_sketch_get_estimate 
   -------------------------
                          2
   (1 row)
   
    aod_sketch_get_estimate 
   -------------------------
                          1
   (1 row)
   ```
   #### Appendix
   I added a few log lines around `aod_union_update`. Before the union operation, I could iterate over the sketch and print its values. After the update, iterating over the sketch produces a segmentation fault:
   ```
   psql:test/aod_sketch_test.sql:25: INFO:  after update
   psql:test/aod_sketch_test.sql:25: INFO:  num_values: 1
   psql:test/aod_sketch_test.sql:25: INFO:  num_retained: 5
   psql:test/aod_sketch_test.sql:25: INFO:  entry.first: 4611686018427387904
   psql:test/aod_sketch_test.sql:25: server closed the connection unexpectedly
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

