Hi all,
We have done some more research on CASSANDRA-14227. The current patch
solves the TTL limit issue by switching TTL to
long instead of int. This approach does not have a negative impact
on memtable memory usage, as C* controls the memory used by the
memtable, but based on our testing it increases the bytes flushed
by 4 to 7% and the bytes on disk by 2 to 3%.
As a mitigation, it is possible to encode localDeletionTime as a vint. This
results in a 1% improvement but might cause additional
computation during compaction or some other operations.
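To make the vint idea concrete, here is a minimal Java sketch of a generic variable-length integer encoding (7 payload bits per byte, LEB128 style). This is not Cassandra's own vint format, and the class and method names are hypothetical. It only illustrates why a present-day localDeletionTime needs fewer bytes than a fixed 8-byte long.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only: a generic varint, not Cassandra's vint coding.
public class VarintSketch {
    /** Encodes a non-negative value using 7 payload bits per byte, low bits first. */
    static byte[] encode(long value) {
        if (value < 0)
            throw new IllegalArgumentException("negative values not supported: " + value);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // While more than 7 bits remain, emit 7 bits and set the continuation flag.
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value); // last byte, continuation flag clear
        return out.toByteArray();
    }

    /** Decodes a value produced by encode(). */
    static long decode(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        long deletionTime = 1_666_000_000L; // an epoch-seconds value from October 2022
        System.out.println(encode(deletionTime).length + " bytes instead of 8");
    }
}
```

With this scheme a value such as 1,666,000,000 takes 5 bytes rather than 8, at the cost of a loop on every read and write, which is the kind of extra computation mentioned above.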
Benedict's proposal to keep using ints for TTL, but as a delta
to nowInSecond, would work for memtables but not for the
SSTable, where nowInSecond does not exist. As a consequence, we would
still suffer from the impact on bytes flushed and bytes on disk.
Another approach that was suggested is the use of unsigned
integers. Java 8 has an unsigned integer API that would allow us to
use unsigned ints for TTLs. By our calculation, unsigned ints
would give us a maximum time of 136 years since the Unix Epoch and
therefore a maximum expiration timestamp in 2106. We would have to
cap TTL at 20y instead of 68y to give us enough breathing room
though, otherwise in 2035 we'd hit the same problem again.
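For reference, here is a small Java sketch of the unsigned int idea. It uses only the standard Java 8 methods Integer.toUnsignedLong and Integer.compareUnsigned; the class and helper names are hypothetical and not taken from the patch.

```java
import java.time.Instant;

// Hypothetical illustration: epoch seconds stored in an int whose bits are read as unsigned.
public class UnsignedExpiration {
    // Largest unsigned 32-bit value: 2^32 - 1 seconds after the Unix Epoch.
    static final long MAX_UNSIGNED_SECONDS = Integer.toUnsignedLong(-1);

    /** Packs an epoch-seconds value into an int, rejecting anything out of range. */
    static int encode(long epochSeconds) {
        if (epochSeconds < 0 || epochSeconds > MAX_UNSIGNED_SECONDS)
            throw new IllegalArgumentException("out of unsigned int range: " + epochSeconds);
        return (int) epochSeconds; // values above Integer.MAX_VALUE become negative ints
    }

    /** Recovers the epoch-seconds value from the stored int. */
    static long decode(int stored) {
        return Integer.toUnsignedLong(stored);
    }

    /** Comparisons must go through the unsigned API, since a plain <= would be wrong. */
    static boolean isExpired(int storedExpiration, int storedNow) {
        return Integer.compareUnsigned(storedExpiration, storedNow) <= 0;
    }

    public static void main(String[] args) {
        // Prints 2106-02-07T06:28:15Z, the last representable expiration instant.
        System.out.println(Instant.ofEpochSecond(MAX_UNSIGNED_SECONDS));
    }
}
```

The main caveat with this approach is that every comparison and arithmetic operation on the stored value has to use the unsigned helpers, since the compiler will not flag a plain signed comparison.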
Happy to hear opinions.
On 18/10/22 10:56, Berenguer Blasi wrote:
Hi,
apologies for the late reply as I have been OOO. I have done
some profiling and results look virtually identical on trunk and
14227. I have attached some screenshots to the ticket https://issues.apache.org/jira/browse/CASSANDRA-14227.
Unless my eyes are fooling me, everything in the JFRs looks the
same.
Regards
On 30/9/22 9:44, Berenguer Blasi wrote:
Hi Benedict,
thanks for the reply! Yes, some profiling is probably needed;
then we can see whether going down the rabbit hole of a big
delta-encoding refactor is worth it.
Let's see what other concerns people bring up.
Thx.
On 29/9/22 11:12, Benedict Elliott Smith wrote:
My only slight concern with this approach is
the additional memory pressure. Since 64yrs should be plenty
at any moment in time, I wonder if it wouldn’t be better to
represent these times as deltas from the nowInSec being used
to process the query. So, long math would only be used to
normalise the times to this nowInSec (from whatever is
stored in the sstable) within a method, and ints would be
stored in memtables and any objects used for processing.
This might admittedly be more work, but I don’t believe it
should be too challenging - we can introduce a method
deletionTime(int nowInSec) that returns a long value by
adding nowInSec to the deletionTime, and make the
underlying value private, refactoring call sites?
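A rough Java sketch of what such an accessor could look like, assuming the stored value is an int delta relative to the nowInSec of the query being processed. The class name and the fromAbsolute factory are hypothetical; only the deletionTime(int nowInSec) signature comes from the proposal above.

```java
// Hypothetical illustration of the delta idea, not code from the ticket.
public final class DeltaDeletionTime {
    // Seconds relative to the query's nowInSec; negative means already in the past.
    private final int deltaFromNow;

    private DeltaDeletionTime(int deltaFromNow) {
        this.deltaFromNow = deltaFromNow;
    }

    /** Normalises an absolute (long) deletion time, e.g. as read from an sstable. */
    static DeltaDeletionTime fromAbsolute(long absoluteDeletionTime, int nowInSec) {
        long delta = absoluteDeletionTime - nowInSec; // long math only happens here
        return new DeltaDeletionTime(Math.toIntExact(delta)); // throws if the delta overflows an int
    }

    /** Returns the absolute deletion time as a long by adding nowInSec to the stored delta. */
    long deletionTime(int nowInSec) {
        return (long) nowInSec + deltaFromNow;
    }

    public static void main(String[] args) {
        int nowInSec = 2_000_000_000;
        DeltaDeletionTime dt = fromAbsolute(4_000_000_000L, nowInSec);
        System.out.println(dt.deletionTime(nowInSec)); // prints 4000000000
    }
}
```

The caller has to pass the same nowInSec that was used when the delta was computed, which is why this only works for objects scoped to a single query.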
Hi all,
I have taken a stab at this in a PR you can find attached to
the ticket. Mainly:
- I have moved deletion times, gc and nowInSec
timestamps to long. That should get us past the 2038
limit.
- TTL is now capped at 68y. Think CQL API
compatibility and a sort of 'free' guardrail.
- A new NONE overflow policy is the default but
everything is backwards compatible by keeping the
previous ones in place. Think upgrade scenarios or
apps relying on the previous behavior.
- The new limit is around year 292,471,208,677 which
sounds ok given the Sun will start collapsing in 3
to 5 billion years :-)
- Please feel free to drop by the ticket and take a
look at the PR even if it's cursory
Thx in advance.