Hi all,
We have done some more research on CASSANDRA-14227. The current patch
solves the TTL limit issue by switching TTL from int to long. This
approach does not have a negative impact on memtable memory usage, as C*
controls the memory used by the memtable, but based on our testing it
increases the bytes flushed by 4 to 7% and the bytes on disk by 2 to 3%.
As a mitigation, it is possible to encode /localDeletionTime/ as a vint.
This results in a 1% improvement but might require additional
computation during compaction and some other operations.
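
To give a feel for where the vint saving comes from, here is a minimal sketch. It assumes a generic base-128 varint with 7 payload bits per byte, which is NOT necessarily Cassandra's actual vint wire format; the class and method names are made up for illustration.

```java
public class VIntSizeSketch {
    // Bytes needed by a generic base-128 varint (7 payload bits per byte).
    // Illustrative only: this is an assumption, not Cassandra's real encoding.
    public static int varintSize(long value) {
        // "| 1" makes zero count as needing one bit.
        int bits = 64 - Long.numberOfLeadingZeros(value | 1);
        return (bits + 6) / 7; // round up to whole 7-bit groups
    }

    public static void main(String[] args) {
        long nowInSec = 1_666_000_000L; // an epoch-second value from late 2022
        System.out.println("varint bytes: " + varintSize(nowInSec));
        System.out.println("fixed long bytes: " + Long.BYTES);
    }
}
```

Under that assumption a present-day epoch-second value fits in fewer bytes than a fixed 8-byte long, at the cost of a variable-length decode on every read.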
Benedict's proposal to keep using ints for TTL, but as a delta to
nowInSecond, would work for memtables but not in SSTables, where
nowInSecond does not exist. As a consequence we would still suffer
from the impact on bytes flushed and bytes on disk.
Another approach that was suggested is the use of unsigned integers.
Java 8 has an unsigned integer API that would allow us to use unsigned
ints for TTLs. By our computation, unsigned ints would give us a maximum
time of 136 years since the Unix Epoch and therefore a maximum
expiration timestamp in 2106. We would have to keep the maximum TTL at
20y instead of 68y to give us enough breathing room though, otherwise in
2035 we'd hit the same problem again.
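
As a rough illustration of the unsigned option, the sketch below uses the Java 8 methods Integer.toUnsignedLong and Integer.compareUnsigned to treat an int as unsigned epoch seconds. The class and helper names are hypothetical and not part of any patch.

```java
import java.time.Instant;

public class UnsignedTtlSketch {
    // Widen an int that holds unsigned epoch seconds into a long.
    public static long toEpochSeconds(int unsignedSeconds) {
        return Integer.toUnsignedLong(unsignedSeconds);
    }

    public static void main(String[] args) {
        int maxUnsigned = 0xFFFFFFFF; // -1 when read as signed, 4294967295 as unsigned

        // Largest timestamp a signed int can hold.
        System.out.println(Instant.ofEpochSecond(Integer.MAX_VALUE)); // 2038-01-19T03:14:07Z
        // Largest timestamp an unsigned int can hold.
        System.out.println(Instant.ofEpochSecond(toEpochSeconds(maxUnsigned))); // 2106-02-07T06:28:15Z

        // Ordering has to go through the unsigned comparison, otherwise
        // values past 2038 would sort as negative numbers.
        System.out.println(Integer.compareUnsigned(maxUnsigned, Integer.MAX_VALUE) > 0); // true
    }
}
```

The catch is that every comparison and arithmetic operation on such a field needs the unsigned variant; a plain `<` on the raw int would silently give the wrong answer after 2038.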
Happy to hear opinions.
On 18/10/22 10:56, Berenguer Blasi wrote:
Hi,
Apologies for the late reply as I have been OOO. I have done some
profiling and the results look virtually identical on trunk and 14227. I
have attached some screenshots to the ticket
https://issues.apache.org/jira/browse/CASSANDRA-14227. Unless my eyes
are fooling me, everything in the JFRs looks the same.
Regards
On 30/9/22 9:44, Berenguer Blasi wrote:
Hi Benedict,
Thanks for the reply! Yes, some profiling is probably needed; then we
can see whether going down the rabbit hole of a big delta-encoding
refactor is worth it.
Let's see what other concerns people bring up.
Thx.
On 29/9/22 11:12, Benedict Elliott Smith wrote:
My only slight concern with this approach is the additional memory
pressure. Since 64yrs should be plenty at any moment in time, I
wonder if it wouldn’t be better to represent these times as deltas
from the nowInSec being used to process the query. So, long math
would only be used to normalise the times to this nowInSec (from
whatever is stored in the sstable) within a method, and ints would
be stored in memtables and any objects used for processing.
This might admittedly be more work, but I don’t believe it should be
too challenging - we can introduce a method deletionTime(int
nowInSec) that returns a long value by adding nowInSec to the
deletionTime, and make the underlying value private, refactoring
call sites?
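
A minimal sketch of the accessor described above, under stated assumptions: the class name and the deltaFromNow field are hypothetical, and nowInSec is taken as a long here (rather than the int written above) so that the example still works for times past 2038.

```java
public class DeltaDeletionTimeSketch {
    // Hypothetical private field: deletion time stored as an int offset,
    // in seconds, from the nowInSec used when the object was built.
    private final int deltaFromNow;

    public DeltaDeletionTimeSketch(long absoluteDeletionTime, long nowInSec) {
        // Long math only here; toIntExact throws ArithmeticException if the
        // delta does not fit in an int (about 68 years either way).
        this.deltaFromNow = Math.toIntExact(absoluteDeletionTime - nowInSec);
    }

    // Rebuild the absolute deletion time by adding nowInSec to the delta.
    public long deletionTime(long nowInSec) {
        return nowInSec + deltaFromNow;
    }

    public static void main(String[] args) {
        long nowInSec = 3_900_000_000L; // a time past the signed int limit
        DeltaDeletionTimeSketch d = new DeltaDeletionTimeSketch(4_000_000_000L, nowInSec);
        System.out.println(d.deletionTime(nowInSec));
    }
}
```

The delta is only meaningful relative to the same nowInSec it was computed against, so callers have to pass that value through consistently.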
On 29 Sep 2022, at 09:37, Berenguer Blasi
<berenguerbl...@gmail.com> wrote:
Hi all,
I have taken a stab at this in a PR that you can find attached to the ticket.
Mainly:
- I have moved deletion times, gc and nowInSec timestamps to long.
That should get us past the 2038 limit.
- TTL is now capped at 68y. Think CQL API compatibility and a sort
of a 'free' guardrail.
- A new NONE overflow policy is the default but everything is
backwards compatible by keeping the previous ones in place. Think
upgrade scenarios or apps relying on the previous behavior.
- The new limit is around year 292,471,208,677 which sounds ok
given the Sun will start collapsing in 3 to 5 billion years :-)
- Please feel free to drop by the ticket and take a look at the PR
even if it's cursory
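
For reference, the year quoted in the list above matches what you get by dividing Long.MAX_VALUE seconds by a 365-day year. A minimal check, assuming 365-day years with no leap days; the class and method names are invented for illustration.

```java
public class MaxLongYearSketch {
    // Assumption: plain 365-day years, ignoring leap days.
    static final long SECONDS_PER_YEAR = 365L * 24 * 60 * 60;

    // Whole years that fit in Long.MAX_VALUE seconds.
    public static long maxYears() {
        return Long.MAX_VALUE / SECONDS_PER_YEAR;
    }

    public static void main(String[] args) {
        System.out.println(maxYears()); // 292471208677
    }
}
```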
Thx in advance.