I am on the same page as Enrico: I don't have much experience with RocksDB.
As you can see from the discussion on the PR
https://github.com/apache/bookkeeper/pull/2686#discussion_r613468033
the perf impact was a concern; OTOH the PR is fixing a perf issue at seek
time.
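
For context, the pattern under discussion is roughly the following. This is
my hedged sketch of the idea, not the actual BookKeeper code; the key
encoding and call sites are simplified:

```java
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

final class DeleteThenCompactSketch {
    // Hedged sketch of the delete-then-compact pattern from #2686;
    // firstKey/lastKey stand in for the real entry-location key encoding.
    static void deleteLedgerIndex(RocksDB db, byte[] firstKey, byte[] lastKey)
            throws RocksDBException {
        // Range deletion only writes tombstones; until compaction removes them,
        // seeks over this range have to skip them (the slowness #2686 targets).
        db.deleteRange(firstKey, lastKey);
        // Manual compaction purges the tombstones right away, but it is the
        // heavy, CPU-hungry step this thread is debating.
        db.compactRange(firstKey, lastKey);
    }
}
```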

I searched and found this guide:
https://github.com/EighteenZi/rocksdb_wiki/blob/master/RocksDB-Tuning-Guide.md
Is it possible that tuning max_background_compactions and some other
parameters can help?
This PR https://github.com/apache/bookkeeper/pull/3056 made RocksDB
tuning easier.
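
To make that concrete, here is a minimal sketch of the kind of tuning I have
in mind, using the RocksDB Java API directly. The option values are examples
only, and how they get wired into the bookie config depends on #3056:

```java
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

public class RocksDbTuningSketch {
    public static void main(String[] args) throws RocksDBException {
        RocksDB.loadLibrary();
        // Example values only; the point is to lean on background compaction
        // instead of the manual compactRange() call.
        Options options = new Options()
                .setCreateIfMissing(true)
                .setMaxBackgroundJobs(4)                // shared flush/compaction thread pool
                .setLevel0FileNumCompactionTrigger(4);  // start compaction a bit earlier
        try (RocksDB db = RocksDB.open(options, "/tmp/entry-location-index")) {
            // normal reads/writes; tombstones get compacted away in the background
        }
        options.close();
    }
}
```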

I'll help with reviewing a PR (hopefully supplemented with perf test
results), but I cannot commit to fixing it.
I hope you and Maurice (author of the original PR) can find a workable
compromise.

On Tue, Mar 15, 2022 at 7:14 AM Enrico Olivelli <eolive...@gmail.com> wrote:

> Hang,
>
> On Tue, Mar 15, 2022 at 02:47 Hang Chen
> <chenh...@apache.org> wrote:
> >
> > Hi BookKeeper Community,
> >
> >   For BookKeeper 4.14.0+, I have noticed that index deletion
> > sometimes takes around 60 seconds, which causes the CPU to spike to
> > 100%:
> > ```
> > [2022-02-28T07:25:42.531Z] INFO db-storage-cleanup-10-1
> > EntryLocationIndex:191 Deleting indexes for ledgers: [3385184,
> > 3385239, 3385159, 3385142, 3385124, 3385193, 3384879, 3385165,
> > 3385916]
> > [2022-02-28T07:26:34.089Z] INFO db-storage-cleanup-10-1
> > EntryLocationIndex:266 Deleted indexes for 201065 entries from 9
> > ledgers in 51.557 seconds
> > [2022-02-28T07:40:42.534Z] INFO db-storage-cleanup-10-1
> > EntryLocationIndex:191 Deleting indexes for ledgers: [3385379,
> > 3385367, 3385718, 3385365, 3385412, 3385167, 3385357, 3386141]
> > [2022-02-28T07:41:47.867Z] INFO db-storage-cleanup-10-1
> > EntryLocationIndex:266 Deleted indexes for 134590 entries from 8
> > ledgers in 65.332 seconds
> > ```
> >
> > RocksDB compaction is a heavy operation and the checkpoint is
> > triggered at high frequency, which keeps the db-storage-cleanup thread
> > under constant heavy load and pins the CPU at 100%.
> >
> > This change was introduced by
> > https://github.com/apache/bookkeeper/pull/2686. The motivation of that
> > PR is:
> >
> > > After deleting many ledgers, seeking to the end of the RocksDB
> > > metadata can take a long time and trigger timeouts upstream. Address this
> > > by improving the seek logic as well as compacting out tombstones in
> > > situations where we've just deleted many entries. This affects the entry
> > > location index and the ledger metadata index.
> >
> > For RocksDB, the CompactRange operation has high overhead, so we'd
> > better avoid calling it manually. Since RocksDB 7.0, the `compactRange`
> > API has been removed.
> > https://github.com/facebook/rocksdb/pull/9444
> >
> > IMO, we'd better remove the manual compactRange call in this PR, and
> > increase `max_background_jobs` to accelerate auto compaction.
> >
> > Would you please give me more ideas?
> I don't have much experience with RocksDB.
>
> Did you make a prototype?
>
> Sharing some results from a prototype would help a lot.
>
> I am not sure, but maybe we can add an option to enable/disable manual
> compaction and to tune max_background_jobs;
> this way we can roll back in case of problems with your proposal.
>
> Enrico
>
> >
> > Thanks,
> > Hang
>
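
To make Enrico's enable/disable suggestion above concrete, here is a hedged
sketch of what such a toggle could look like on the bookie side; the property
name "dbStorage_rocksDB_manualCompactionEnabled" and the wiring are
hypothetical, only to illustrate the idea:

```java
import org.apache.bookkeeper.conf.ServerConfiguration;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

final class ManualCompactionToggleSketch {
    // Hypothetical property name, just to illustrate the enable/disable idea.
    private static final String MANUAL_COMPACTION_KEY =
            "dbStorage_rocksDB_manualCompactionEnabled";

    static void maybeCompact(ServerConfiguration conf, RocksDB db,
                             byte[] firstKey, byte[] lastKey) throws RocksDBException {
        // Default true keeps the current #2686 behaviour; setting it to false
        // falls back to background compaction (tuned via max_background_jobs).
        if (conf.getBoolean(MANUAL_COMPACTION_KEY, true)) {
            db.compactRange(firstKey, lastKey);
        }
    }
}
```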


-- 
Andrey Yegorov
