[ https://issues.apache.org/jira/browse/CASSANDRA-18111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854876#comment-17854876 ]
Stefan Miklosovic edited comment on CASSANDRA-18111 at 6/13/24 11:25 PM:
-------------------------------------------------------------------------

https://github.com/apache/cassandra/pull/3374

I went with the WatchService approach. When somebody deletes snapshots from disk by hand, the deletion is detected and the snapshot is removed from SnapshotManager, so it is no longer visible in nodetool listsnapshots or in system_views.snapshots.

At the same time, we do not go to disk when we list snapshots (nodetool / jmx / cql). We only ever go to disk when we 1) start a node and load everything, or 2) remove a snapshot. Listing is IO-free.

cc [~rustyrazorblade] [~jjirsa] to let you know where this stands on IO, as we were recently dealing with hints, if you remember. I am trying not to go to disk while listing either.


was (Author: smiklosovic):
    https://github.com/apache/cassandra/pull/3374

I went with WatchService approach. How it works is that when somebody delete snapshots from disk by hand, this will be detected and it will be removed from SnapshotManager hence it will not be visible among nodetool listsnapshots nor in system_views.snapshots.

At the same time, we are not going to the disk when we list snapshots (nodetool / jmx / cql). We will ever go to disk only when we 1) start node and load it all 2) remove a snapshot. Listing is io-free.

cc [~rustyrazorblade] [~jjirsa] to let you know when it comes to IO as we were recently dealing with hints if you remember. I am trying to not to go to disk with snapshot at all either.

> Cache snapshots in memory
> -------------------------
>
>                 Key: CASSANDRA-18111
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18111
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Snapshots
>            Reporter: Paulo Motta
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Every time {{nodetool listsnapshots}} is called, all data directories are
> scanned to find snapshots, which is inefficient.
> For example, fetching the
> {{org.apache.cassandra.metrics:type=ColumnFamily,name=SnapshotsSize}} metric
> can take half a second (CASSANDRA-13338).
> This improvement will also allow snapshots to be efficiently queried via
> virtual tables (CASSANDRA-18102).
> In order to do this, we should:
> a) load all snapshots from disk during initialization
> b) keep a collection of snapshots on {{SnapshotManager}}
> c) update the snapshots collection anytime a new snapshot is taken or cleared
> d) detect when a snapshot is manually removed from disk.
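
For illustration only, steps a) to d) combined with the WatchService approach from the comment could be sketched roughly as below. This is not the code from the pull request: the class SnapshotCache and its method names are hypothetical, and it assumes a single flat directory with one subdirectory per snapshot, whereas real snapshots live under each table's data directory. Only the java.nio.file calls are real JDK API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.stream.Stream;

// Hypothetical stand-in for illustration; not Cassandra's SnapshotManager.
final class SnapshotCache implements AutoCloseable {
    private final Set<String> snapshots = ConcurrentHashMap.newKeySet();
    private final WatchService watcher;

    // (a) The only full directory scan happens here, at start-up.
    SnapshotCache(Path snapshotsDir) throws IOException {
        watcher = snapshotsDir.getFileSystem().newWatchService();
        // Register before scanning so a deletion during the scan is not missed.
        snapshotsDir.register(watcher, StandardWatchEventKinds.ENTRY_DELETE);
        try (Stream<Path> entries = Files.list(snapshotsDir)) {
            entries.filter(Files::isDirectory)
                   .forEach(p -> snapshots.add(p.getFileName().toString()));
        }
    }

    // (c) Called when a snapshot is taken or cleared through the normal path.
    void onSnapshotTaken(String name) { snapshots.add(name); }
    void onSnapshotCleared(String name) { snapshots.remove(name); }

    // (b) Listing reads the in-memory collection only; no disk access.
    Set<String> list() { return new TreeSet<>(snapshots); }

    // (d) Waits up to timeoutMillis for delete events and evicts the matching
    // entries. Returns the number of entries evicted.
    int processDeletions(long timeoutMillis) throws InterruptedException {
        WatchKey key = watcher.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (key == null) {
            return 0;
        }
        int evicted = 0;
        for (WatchEvent<?> event : key.pollEvents()) {
            // OVERFLOW (lost events) would call for a rescan; omitted here.
            if (event.kind() == StandardWatchEventKinds.ENTRY_DELETE) {
                // The context of ENTRY_DELETE is the deleted entry's relative path.
                Path deleted = (Path) event.context();
                if (snapshots.remove(deleted.toString())) {
                    evicted++;
                }
            }
        }
        key.reset(); // re-arm the key so later events are delivered
        return evicted;
    }

    @Override
    public void close() throws IOException { watcher.close(); }
}
```

In a real node, processDeletions would presumably run in a loop on a background thread, so that list() never has to wait on disk. Note that WatchService is not recursive, so each snapshots directory would need its own registration, and on some platforms events are delivered by polling and can arrive with a delay.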