[jira] [Created] (FLINK-33946) RocksDb sets setAvoidFlushDuringShutdown to true to speed up Task Cancel

Yue Ma (Jira) Tue, 26 Dec 2023 19:25:19 -0800

Yue Ma created FLINK-33946:
------------------------------

             Summary: RocksDb sets setAvoidFlushDuringShutdown to true to speed 
up Task Cancel
                 Key: FLINK-33946
                 URL: https://issues.apache.org/jira/browse/FLINK-33946
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / State Backends
    Affects Versions: 1.19.0
            Reporter: Yue Ma
             Fix For: 1.19.0



When a job fails, its tasks must be canceled and redeployed. During disposal,
the RocksDB state backend calls RocksDB.close. The relevant RocksDB shutdown
logic is:


{code:c++}
if (!shutting_down_.load(std::memory_order_acquire) &&
    has_unpersisted_data_.load(std::memory_order_relaxed) &&
    !mutable_db_options_.avoid_flush_during_shutdown) {
  if (immutable_db_options_.atomic_flush) {
    autovector<ColumnFamilyData*> cfds;
    SelectColumnFamiliesForAtomicFlush(&cfds);
    mutex_.Unlock();
    Status s =
        AtomicFlushMemTables(cfds, FlushOptions(), FlushReason::kShutDown);
    s.PermitUncheckedError();  //**TODO: What to do on error?
    mutex_.Lock();
  } else {
    for (auto cfd : *versions_->GetColumnFamilySet()) {
      if (!cfd->IsDropped() && cfd->initialized() && !cfd->mem()->IsEmpty()) {
        cfd->Ref();
        mutex_.Unlock();
        Status s = FlushMemTable(cfd, FlushOptions(), FlushReason::kShutDown);
        s.PermitUncheckedError();  //**TODO: What to do on error?
        mutex_.Lock();
        cfd->UnrefAndTryDelete();
      }
    }
  }
}
{code}


By default (avoid_flush_during_shutdown=false), RocksDB flushes its memtables
when the database is closed. Under high disk pressure, or when the memtables
are large, this flush can take a long time. The task then stays stuck in the
canceling stage, which slows down job failover.
This flush is unnecessary when a Flink task is closed, because the state can
be restored from the checkpoint. We can therefore set
avoid_flush_during_shutdown to true to speed up task failover.
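
As a rough illustration (this is neither Flink nor RocksDB code), the guard
in the snippet above can be modeled in plain Java. The class and method names
are hypothetical; only the three inputs are taken from the quoted C++
condition. In RocksJava the option would be set through
setAvoidFlushDuringShutdown, as named in the summary.

```java
// Hypothetical model of the shutdown guard quoted above; not part of any library.
final class ShutdownFlushModel {

    /**
     * Returns true when the quoted guard would trigger a memtable flush:
     * the DB is not already shutting down, there is unpersisted data,
     * and avoid_flush_during_shutdown is false.
     */
    static boolean shouldFlushOnClose(
            boolean shuttingDown,
            boolean hasUnpersistedData,
            boolean avoidFlushDuringShutdown) {
        return !shuttingDown && hasUnpersistedData && !avoidFlushDuringShutdown;
    }

    public static void main(String[] args) {
        // Default setting: unpersisted data is flushed during close.
        System.out.println("default option -> flush: "
                + shouldFlushOnClose(false, true, false));
        // Proposed setting: the flush is skipped, so close does no flush I/O.
        System.out.println("avoid flush    -> flush: "
                + shouldFlushOnClose(false, true, true));
    }
}
```

With the proposed setting, whatever is still in the memtables is simply
dropped on close, which is acceptable here only because the task restores its
state from the checkpoint.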



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
