[jira] [Updated] (GEODE-9513) optimize redis SCAN commands cursor type

Darrel Schneider (Jira) Tue, 17 Aug 2021 09:20:19 -0700


     [ 
https://issues.apache.org/jira/browse/GEODE-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Darrel Schneider updated GEODE-9513:
------------------------------------
    Description: 
The geode SCAN commands implement the cursor with a BigInteger type. It seems 
like it could instead be a "long". The following are some discussions on slack 
about this:
 Darrel Schneider:
 Does anyone know why the redis HSCAN, SSCAN, and SCAN cursor in our 
implementation is a BigInteger? This seems like overkill. I would think in our 
implementation a 64-bit signed value would be big enough for the scan cursor.

Donal Evans
 I think it's to ensure that our input values match native Redis. Redis takes 
values in the range of BigInteger for cursor, even though they have the same 
constraints on the number of elements in their collections that we do

Darrel Schneider 
 The redis docs say:
{noformat}
Calling SCAN with a broken, negative, out of range, or otherwise invalid 
cursor, will result into undefined behavior but never into a crash. What will 
be undefined is that the guarantees about the returned elements can no longer 
be ensured by the SCAN implementation.
The only valid cursors to use are:
The cursor value of 0 when starting an iteration.
The cursor returned by the previous call to SCAN in order to continue the 
iteration.
{noformat}
Since we will never return a cursor larger than a signed 64-bit number is seems 
like if we see one come in we could immediately return an empty scan since that 
is what we will do anyway in our current implementation but we do a bunch of 
work to figure that out. Since we never return a cursor larger than 
Long.MAX_VALUE it seems like we could do some validation on the input and only 
if it looks like a valid cursor do a real scan. That would allow our scan impl 
to use a "long" instead of a BigInteger. Since BigInteger is immutable 
incrementing it produces a bunch of garbage. So it seems like we could be 
compatible with native redis without using a BigInteger all the time. Let me 
know if I'm missing something. 

Donal Evans
 I think that's a good idea, but we'd probably still have to create a 
BigInteger at least once, to cover all the bases. There are three cases we have 
to deal with here, I think: 1. the cursor is a valid positive Long, in which 
case we parse it and use it. 2. the cursor is a positive integer larger than 
Long.MAX_VALUE but smaller than the capacity of an unsigned long (which is the 
largest value Redis accepts) in which case we just return an empty scan 3. the 
cursor is negative, larger than the capacity of an unsigned long or not an 
integer, in which case we return an error. Telling the difference between cases 
2 and 3 is where we might run into trouble, since we've been trying to exactly 
mirror Redis' behaviour when it comes to errors

  was:
The geode SCAN commands implement the cursor with a BigInteger type. It seems 
like it could instead be a "long". The following are some discussions on slack 
about this:
Darrel Schneider:
Does anyone know why the redis HSCAN, SSCAN, and SCAN cursor in our 
implementation is a BigInteger? This seems like overkill. I would think in our 
implementation a 64-bit signed value would be big enough for the scan cursor.

Donal Evans
I think it's to ensure that our input values match native Redis. Redis takes 
values in the range of BigInteger for cursor, even though they have the same 
constraints on the number of elements in their collections that we do

Darrel Schneider 
The redis docs say: 
{noformat}
Calling SCAN with a broken, negative, out of range, or otherwise invalid 
cursor, will result into undefined behavior but never into a crash. What will 
be undefined is that the guarantees about the returned elements can no longer 
be ensured by the SCAN implementation.
The only valid cursors to use are:
The cursor value of 0 when starting an iteration.
The cursor returned by the previous call to SCAN in order to continue the 
iteration.
{noformat}
Since we will never return a cursor larger than a signed 64-bit number is seems 
like if we see one come in we could immediately return an empty scan since that 
is what we will do anyway in our current implementation but we do a bunch of 
work to figure that out. Since we never return a cursor larger than 
Long.MAX_VALUE it seems like we could do some validation on the input and only 
if it looks like a valid cursor do a real scan. That would allow our scan impl 
to use a "long" instead of a BigInteger. Since BigInteger is immutable 
incrementing it produces a bunch of garbage. So it seems like we could be 
compatible with native redis without using a BigInteger all the time. Let me 
know if I'm missing something. (edited) 
New

Donal Evans
I think that's a good idea, but we'd probably still have to create a BigInteger 
at least once, to cover all the bases. There are three cases we have to deal 
with here, I think: 1. the cursor is a valid positive Long, in which case we 
parse it and use it. 2. the cursor is a positive integer larger than 
Long.MAX_VALUE but smaller than the capacity of an unsigned long (which is the 
largest value Redis accepts) in which case we just return an empty scan 3. the 
cursor is negative, larger than the capacity of an unsigned long or not an 
integer, in which case we return an error. Telling the difference between cases 
2 and 3 is where we might run into trouble, since we've been trying to exactly 
mirror Redis' behaviour when it comes to errors



> optimize redis SCAN commands cursor type
> ----------------------------------------
>
>                 Key: GEODE-9513
>                 URL: https://issues.apache.org/jira/browse/GEODE-9513
>             Project: Geode
>          Issue Type: Improvement
>          Components: redis
>            Reporter: Darrel Schneider
>            Priority: Major
>
> The geode SCAN commands implement the cursor with a BigInteger type. It seems 
> like it could instead be a "long". The following are some discussions on 
> slack about this:
>  Darrel Schneider:
>  Does anyone know why the redis HSCAN, SSCAN, and SCAN cursor in our 
> implementation is a BigInteger? This seems like overkill. I would think in 
> our implementation a 64-bit signed value would be big enough for the scan 
> cursor.
> Donal Evans
>  I think it's to ensure that our input values match native Redis. Redis takes 
> values in the range of BigInteger for cursor, even though they have the same 
> constraints on the number of elements in their collections that we do
> Darrel Schneider 
>  The redis docs say:
> {noformat}
> Calling SCAN with a broken, negative, out of range, or otherwise invalid 
> cursor, will result into undefined behavior but never into a crash. What will 
> be undefined is that the guarantees about the returned elements can no longer 
> be ensured by the SCAN implementation.
> The only valid cursors to use are:
> The cursor value of 0 when starting an iteration.
> The cursor returned by the previous call to SCAN in order to continue the 
> iteration.
> {noformat}
> Since we will never return a cursor larger than a signed 64-bit number is 
> seems like if we see one come in we could immediately return an empty scan 
> since that is what we will do anyway in our current implementation but we do 
> a bunch of work to figure that out. Since we never return a cursor larger 
> than Long.MAX_VALUE it seems like we could do some validation on the input 
> and only if it looks like a valid cursor do a real scan. That would allow our 
> scan impl to use a "long" instead of a BigInteger. Since BigInteger is 
> immutable incrementing it produces a bunch of garbage. So it seems like we 
> could be compatible with native redis without using a BigInteger all the 
> time. Let me know if I'm missing something. 
> Donal Evans
>  I think that's a good idea, but we'd probably still have to create a 
> BigInteger at least once, to cover all the bases. There are three cases we 
> have to deal with here, I think: 1. the cursor is a valid positive Long, in 
> which case we parse it and use it. 2. the cursor is a positive integer larger 
> than Long.MAX_VALUE but smaller than the capacity of an unsigned long (which 
> is the largest value Redis accepts) in which case we just return an empty 
> scan 3. the cursor is negative, larger than the capacity of an unsigned long 
> or not an integer, in which case we return an error. Telling the difference 
> between cases 2 and 3 is where we might run into trouble, since we've been 
> trying to exactly mirror Redis' behaviour when it comes to errors



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (GEODE-9513) optimize redis SCAN commands cursor type

Reply via email to