[ https://issues.apache.org/jira/browse/CASSANDRA-17021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18028261#comment-18028261 ]
Yifan Cai commented on CASSANDRA-17021:
---------------------------------------
Logging the sampling status sounds good. I will address it.
bq. just flush right away when starting training
I am a bit concerned about that behavior, mainly from a single-responsibility
perspective. That said, I do realize that 'flush' is almost required when
training manually; the 'traincompressiondictionary' command currently suggests
running 'flush' along with it. Maybe we can add a command option that enables
auto-flush at a specified interval, off by default. That way, operators have
the flexibility to run flush manually or have it run automatically along with
the command. How does that sound to you?
> Enhance Zstd support in Cassandra with dictionaries
> ---------------------------------------------------
>
> Key: CASSANDRA-17021
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17021
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: Feature/Compression
> Reporter: Dinesh Joshi
> Assignee: Yifan Cai
> Priority: Normal
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>
> Currently, Cassandra supports Zstd compression. However, Zstd also supports
> dictionaries, which improve not only the compression ratio but also the speed.
> Dictionaries can show 3-4x savings. We should add support for training
> dictionaries, ideally per SSTable, as this will yield the maximum gains.