The GitHub Actions job "CI" on kvrocks.git has failed.
Run started by GitHub user git-hulk (triggered by git-hulk).

Head commit for run:
431277c786c32c049e464914b7e6a98a76d48867 / 纪华裕 <8042...@qq.com>
Add Redis-compatible cursors for `SCAN` commands (#1489)

We have added the following steps to the processing of the `SCAN`, `ZSCAN`, 
`SSCAN`, `HSCAN` commands:
- Before processing the command, convert the `numeric cursor` sent by the 
client back to the `keyname cursor`.
- After processing the command, convert the `keyname cursor` back to a `numeric 
cursor` and return it to the client, And store the conversion dictionary in the 
`Server#cursor_dict_`.

Since those steps are non-intrusive to the internal implementation of the 
`SCAN` command, we can ensure that the behavior of commands such as `ZSCAN`, 
`SSCAN`, and `HSCAN` is consistent with that of the `SCAN` command.
In the following, I will only use the `SCAN` command as an example to describe 
the new design.

### Cursor design

We call the new cursor the `numeric cursor` and the old one the `keyname 
cursor`. 
The numeric cursor is composed of 3 parts: 
- 1-16 bit is counter,and the first bit always 1
- 17-32 bit is timestamp
- 33-64 bit is hash of keyname

The `counter` is a 16-bit unsigned integer that is incremented by 1 every time. 
When the `counter` overflows, it returns to 0 because it is an unsigned number. 
Since our `cursor_dict_size` is a power of 2, the `counter` is continuous mod 
`cursor_dict_size`.
`timestap` is a 16-bit timestamp in seconds, which can store up to 9 hours.
`hash` is a 32-bit hash value of the `keyname`.

### Cursor dictionary(`Server#cursor_dict_`)

The cursor dictionary is an array with a length of 16384(1024*16), which is 
determined at compile time, and occupies about 640KB of memory. Including the 
length of the referenced keyname strings, its size is about 1-2M.

During converting the `keyname cursor` back to a `numeric cursor`, a new cursor 
is generated based on the above rules, and the index for storing the dictionary 
is determined based on the counter value (index = counter % DICT_SIZE). 

During converting the `numeric cursor` back to a `keyname cursor`, we get the 
counter from the cursor and calculate the index of the cursor in the 
`cursor_dict_` based on the counter. We only need to compare the cursor value 
of the item at that index with the input cursor value to determine if they are 
the same.

### Other information about the cursor
This design guarantees the validity of the latest 16384(1024*16) cursors, while 
cursors that are older or not generated in our system are considered invalid 
cursors. For invalid cursors, we treat them as a 0 cursor, which means we will 
start iterating over the collection from the beginning.

Our cursor is globally visible, and we store index information in the cursor. 
As long as the cursor remains valid, using the same cursor in different 
connections will produce the same results.

We prevent other users from guessing the data traversed by adjacent cursors by 
adding the hash value of the `keyname` to the cursor. If a user tries to obtain 
adjacent cursor information by traversing the hash, the cursor will become an 
invalid cursor before the traversal is complete because the size of the 32-bit 
space is much larger than the length of the `cursor_dict_`.

We add a timestamp to the cursor to ensure that the same cursor does not appear 
within a short period of time before and after a restart.

Other behaviors are consistent with the original SCAN implementation.

### Configuration file
Added `redis-cursor-compatible` configuration item.
If enabled, the cursor will be an unsigned 64-bit integer.
If disabled, the cursor will be a string.

### Test file
Added tests for `redis-cli --bigkey` and `redis-cli --memkeys` commands. We 
only need to ensure that these commands run correctly, because their 
correctness is guaranteed by `redis-cli` on the premise that we ensure the 
correctness of the `scan` command.
Modified the scan command test to test for the cases where 
`redis-cursor-compatible` is set to `yes` or `no`.


### Other changes:
Fixed a bug where the cursor did not return 0 when `SCAN` commands return less 
than the number of elements.

Report URL: https://github.com/apache/kvrocks/actions/runs/5369596375

With regards,
GitHub Actions via GitBox

Reply via email to