[ https://issues.apache.org/jira/browse/KUDU-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16995872#comment-16995872 ]
Andrew Wong commented on KUDU-3002:
-----------------------------------

With all of these approaches, it's pretty easy to come up with scenarios in
which they'd fall prey to the same issue – if the MRS is much larger than the
deltas, we may not perform the op that unanchors the WAL segments. For the
sake of unanchoring WALs, we may be better off just considering the retained
bytes of each MRS and DMS when choosing what to perform, even if it means
we'll be using more memory temporarily.

> DMS may never flush when under memory pressure and there are on-going inserts
> -----------------------------------------------------------------------------
>
>                 Key: KUDU-3002
>                 URL: https://issues.apache.org/jira/browse/KUDU-3002
>             Project: Kudu
>          Issue Type: Improvement
>          Components: perf, tablet
>            Reporter: Andrew Wong
>            Priority: Major
>
> When under memory pressure, we'll aggressively perform the maintenance
> operation that frees the most memory. Right now, the only ops that register
> memory are MRS and DMS flushes.
> In practice, this means a couple of things:
> * In most cases, we'll prioritize flushing MRSs way ahead of flushing DMSs,
> since updates are spread across many DMSs and will therefore tend to be
> small, whereas any non-trivial insert workload will well up into a single MRS
> for an entire tablet.
> * We'll only flush a single DMS at a time to free memory. Because of this,
> and because we'll likely prioritize MRS flushes over DMS flushes, we may end
> up with a ton of tiny DMSs in a tablet that we'll never flush. This can end
> up bloating the WALs because each DMS may be anchoring some WAL segments.
> A couple of thoughts on small things we can do to improve this:
> * Register the DMS size as RAM anchored by a compaction. This will mean
> that we can schedule compactions to flush DMSs en masse. This would still
> mean that we could end up always prioritizing MRS flushes, depending on how
> quickly we're inserting.
> * We currently register the amount of disk space a LogGC would free up. We
> could do something similar, but register how many log anchors an op could
> release. This would be a bit trickier, since the log anchors aren't solely
> determined by the mem-stores (e.g. we'll anchor segments to catch up slow
> followers).
> * Introduce a new op (or change the flush DMS op) that would flush as many
> DMSs as we can for a given tablet.
> Between these, the first seems like it'd be an easy win.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
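
The starvation described in the issue, and the comment's suggestion to look at the WAL bytes each mem-store retains, can be illustrated with a toy in Python. This is only a sketch under stated assumptions, not Kudu's maintenance manager: `FlushOp`, its fields, and both `pick_*` functions are hypothetical names, and the sizes are made up for illustration.

```python
from dataclasses import dataclass

MiB = 1024 * 1024


@dataclass(frozen=True)
class FlushOp:
    """Hypothetical stand-in for a flush op and the stats it registers."""
    name: str
    ram_anchored: int        # bytes of memory the flush would free
    wal_bytes_retained: int  # bytes of WAL the mem-store is anchoring


def pick_by_ram(ops):
    """Policy described in the issue: run the op that frees the most memory."""
    return max(ops, key=lambda op: op.ram_anchored)


def pick_by_wal_retained(ops):
    """Policy floated in the comment: prefer the op retaining the most WAL
    bytes, using freed memory only as a tie-breaker (an assumption here)."""
    return max(ops, key=lambda op: (op.wal_bytes_retained, op.ram_anchored))


# Made-up numbers: one large MRS, plus small DMSs that anchor old WAL segments.
ops = [
    FlushOp("FlushMRS(tablet-1)", ram_anchored=512 * MiB, wal_bytes_retained=64 * MiB),
    FlushOp("FlushDMS(rowset-7)", ram_anchored=1 * MiB, wal_bytes_retained=1024 * MiB),
    FlushOp("FlushDMS(rowset-9)", ram_anchored=2 * MiB, wal_bytes_retained=256 * MiB),
]

if __name__ == "__main__":
    print("by RAM anchored:  ", pick_by_ram(ops).name)
    print("by WAL retained:  ", pick_by_wal_retained(ops).name)
```

With these inputs the memory-based policy selects the MRS flush, and would keep doing so for as long as inserts refill the MRS, while the retained-bytes policy selects the small DMS that anchors the most WAL. The trade-off is the one noted in the comment: memory usage may stay higher temporarily.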