aweisberg commented on code in PR #4692:
URL: https://github.com/apache/cassandra/pull/4692#discussion_r3053867085


##########
src/java/org/apache/cassandra/db/compaction/CompactionManager.java:
##########
@@ -795,22 +828,37 @@ public AllSSTableOpStatus performCleanup(final 
ColumnFamilyStore cfStore, int jo
 
         return parallelAllSSTableOperation(cfStore, new OneSSTableOperation()
         {
+            boolean incompleteOperation = false;
+
             @Override
             public Iterable<SSTableReader> filterSSTables(LifecycleTransaction 
transaction)
             {
                 List<SSTableReader> sortedSSTables = 
Lists.newArrayList(transaction.originals());
                 Iterator<SSTableReader> sstableIter = 
sortedSSTables.iterator();
                 int totalSSTables = 0;
                 int skippedSStables = 0;
+                int sstablesInUseByAccord = 0;
                 while (sstableIter.hasNext())
                 {
                     SSTableReader sstable = sstableIter.next();
                     boolean needsCleanupFull = needsCleanup(sstable, 
fullRanges);
                     boolean needsCleanupTransient = !transientRanges.isEmpty() 
&& sstable.isRepaired() && needsCleanup(sstable, transientRanges);
+                    totalSSTables++;
+
+                    if 
(sstableContainsRangesNeededByAccord(cfStore.getTableId(), sstable))

Review Comment:
   This entirely excludes sstables from cleanup assuming they are entirely 
needed by Accord when in reality it's just an intersecting section that is 
needed. Cleanup will actually rewrite sstables to remove the parts that can be 
cleaned up.
   
   Being unable to clean up an sstable doesn't represent an incomplete 
operation unless the table is also no longer owned? Just because Accord needs 
it doesn't mean we were going to clean it up. Unless I am missing something 
earlier about already having filtered the sstables to be the ones intersecting 
non-owned ranges. I'm not sure if we are iterating all or just some.



##########
src/java/org/apache/cassandra/db/compaction/CompactionManager.java:
##########
@@ -1599,6 +1670,31 @@ public static boolean needsCleanup(SSTableReader 
sstable, Collection<Range<Token
         return false;
     }
 
+    public static boolean sstableContainsRangesNeededByAccord(TableId tableId, 
SSTableReader sstable)
+    {
+        if (!AccordService.isSetup())
+            return false;
+
+        Future<List<Ranges>> futureAccordOwnedRanges = 
AccordService.toFuture(AccordService.instance().node().commandStores().getInUseRanges());

Review Comment:
   This is going to fetch the in use ranges for every single sstable of which 
there are thousands. It should probably fetch them once at the start and then 
keep referring to it. Also if the usage here is synchronous then the Accord 
method should also be synchronous or have a synchronous version you can call so 
that the boilerplate of getting the async value is handled inside the helper 
function.



##########
src/java/org/apache/cassandra/db/compaction/CompactionManager.java:
##########
@@ -612,7 +619,12 @@ public Object call() throws Exception
                 FBUtilities.waitOnFutures(futures);
                 assert compacting.originals().isEmpty();
                 logger.info("Finished {} for {}.{} successfully", 
operationType, keyspace, table);
-                return AllSSTableOpStatus.SUCCESSFUL;
+
+                // Some SSTables were not fully cleaned up because Accord is 
still using those ranges

Review Comment:
   This is `parallelAllSSTablesOperation` and cleanup is just one kind of thing 
it might do?



##########
src/java/org/apache/cassandra/db/compaction/CompactionManager.java:
##########
@@ -1599,6 +1670,31 @@ public static boolean needsCleanup(SSTableReader 
sstable, Collection<Range<Token
         return false;
     }
 
+    public static boolean sstableContainsRangesNeededByAccord(TableId tableId, 
SSTableReader sstable)
+    {
+        if (!AccordService.isSetup())
+            return false;
+
+        Future<List<Ranges>> futureAccordOwnedRanges = 
AccordService.toFuture(AccordService.instance().node().commandStores().getInUseRanges());
+        logger.info("Waiting for Accord to be ready");

Review Comment:
   Is this log message needed? Should it be debug? It will get printed A LOT.



##########
src/java/org/apache/cassandra/db/compaction/CompactionManager.java:
##########
@@ -1599,6 +1670,31 @@ public static boolean needsCleanup(SSTableReader 
sstable, Collection<Range<Token
         return false;
     }
 
+    public static boolean sstableContainsRangesNeededByAccord(TableId tableId, 
SSTableReader sstable)
+    {
+        if (!AccordService.isSetup())
+            return false;
+
+        Future<List<Ranges>> futureAccordOwnedRanges = 
AccordService.toFuture(AccordService.instance().node().commandStores().getInUseRanges());
+        logger.info("Waiting for Accord to be ready");
+
+        try
+        {
+            Ranges accordOwnedRanges = 
futureAccordOwnedRanges.get().stream().reduce(Ranges.EMPTY, (acc, r) -> 
acc.union(AbstractRanges.UnionMode.MERGE_ADJACENT, r));
+
+            if (sstable.getFirst().equals(sstable.getLast()))
+                return accordOwnedRanges.intersects(new TokenKey(tableId, 
sstable.getFirst().getToken()));
+
+            Ranges sstableRanges = Ranges.of(TokenRange.create(tableId, 
sstable.getFirst().getToken(), sstable.getLast().getToken()));

Review Comment:
   Do you actually need to wrap this in Ranges in order to test intersection?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to