[ https://issues.apache.org/jira/browse/KUDU-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16998539#comment-16998539 ]
ASF subversion and git services commented on KUDU-3002:
-------------------------------------------------------

Commit 111e54c0c8f254fb110919150c1f5d60dca37682 in kudu's branch refs/heads/master from Andrew Wong
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=111e54c ]

KUDU-3002: prioritize WAL unanchoring when under memory pressure

When under memory pressure, we currently prioritize performing the op that
will free the most memory. In theory this seems reasonable -- if Kudu's using
way too much memory, we should try to use less memory as quickly as possible.
In practice, this meant that, while under memory pressure and under a
constant, memory-consuming workload (e.g. inserts to the MRS), Kudu would
starve operations that anchor relatively little memory (e.g. DMS flushes).

This patch updates the behavior so that we prioritize operations that
unanchor the most WAL bytes, breaking ties by prioritizing the ops that use
more memory. This seems reasonable because:
- Ops that anchor WALs also anchor memory. In performing an op that unanchors
  WALs, we are performing an op that frees memory, so we'll still prefer MRS
  and DMS flushing over compactions when under memory pressure.
- We already use this heuristic when _not_ under memory pressure, but it is
  only used when the ops under consideration anchor too many WAL bytes (per
  --log_target_replay_size_mb), lending some credibility to it when used
  conditionally.
- It becomes much more difficult to think of a scenario in which we're
  "stuck" using too much space for WALs or too much memory.

Change-Id: Ibd85e8f2904a36b74cd6a3038c9ec49bb1ff9844
Reviewed-on: http://gerrit.cloudera.org:8080/14910
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <a...@cloudera.com>
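To make the new rule concrete, here is a minimal sketch of the selection
logic under memory pressure. The types and names below (OpStats,
PickOpUnderMemoryPressure) are simplified stand-ins for illustration, not
Kudu's actual maintenance-manager code:

#include <algorithm>
#include <cstdint>
#include <vector>

// Simplified view of the stats a maintenance op reports.
struct OpStats {
  int64_t logs_retained_bytes;  // WAL bytes this op would unanchor
  int64_t ram_anchored;         // memory this op would free
};

// Under memory pressure, pick the op that unanchors the most WAL bytes,
// breaking ties in favor of the op that anchors more memory. Returns
// nullptr if no ops are runnable.
const OpStats* PickOpUnderMemoryPressure(const std::vector<OpStats>& ops) {
  if (ops.empty()) return nullptr;
  auto best = std::max_element(
      ops.begin(), ops.end(),
      [](const OpStats& a, const OpStats& b) {
        if (a.logs_retained_bytes != b.logs_retained_bytes) {
          return a.logs_retained_bytes < b.logs_retained_bytes;
        }
        return a.ram_anchored < b.ram_anchored;
      });
  return &*best;
}

Note that an op anchoring lots of memory but no WAL bytes now loses to any
op that unanchors WAL bytes; that is what keeps a steady insert workload
from starving DMS flushes.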
> DMS may never flush when under memory pressure and there are on-going inserts
> -----------------------------------------------------------------------------
>
>                 Key: KUDU-3002
>                 URL: https://issues.apache.org/jira/browse/KUDU-3002
>             Project: Kudu
>          Issue Type: Improvement
>          Components: perf, tablet
>            Reporter: Andrew Wong
>            Priority: Major
>
> When under memory pressure, we'll aggressively perform the maintenance
> operation that frees the most memory. Right now, the only ops that register
> memory are MRS and DMS flushes.
> In practice, this means a couple of things:
> * In most cases, we'll prioritize flushing MRSs way ahead of flushing DMSs,
> since updates are spread across many DMSs and will therefore tend to be
> small, whereas any non-trivial insert workload will pile up in a single MRS
> for an entire tablet.
> * We'll only flush a single DMS at a time to free memory. Because of this,
> and because we'll likely prioritize MRS flushes over DMS flushes, we may end
> up with a ton of tiny DMSs in a tablet that we'll never flush. This can end
> up bloating the WALs, because each DMS may be anchoring some WAL segments.
> A couple of thoughts on small things we can do to improve this:
> * Register the DMS size as ram anchored by a compaction. This would mean
> that we could schedule compactions to flush DMSs en masse (see the sketch
> below). We could still end up always prioritizing MRS flushes, depending on
> how quickly we're inserting.
> * We currently register the amount of disk space a LogGC would free up. We
> could do something similar, but register how many log anchors an op could
> release. This would be a bit trickier, since the log anchors aren't solely
> determined by the mem-stores (e.g. we'll anchor segments to catch up slow
> followers).
> * Introduce a new op (or change the flush-DMS op) that would flush as many
> DMSs as we can for a given tablet.
> Between these, the first seems like it'd be an easy win.
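As a rough illustration of that first suggestion, here is a minimal sketch
of how a compaction op might report the DMS memory of the rowsets it would
rewrite. RowSetInfo, OpStats, and UpdateCompactionStats are made-up names
for this sketch, not the real Kudu types:

#include <cstdint>
#include <vector>

// Illustrative stand-ins with just enough structure to show the idea.
struct RowSetInfo {
  int64_t dms_bytes;  // memory currently held by this rowset's DMS
};

struct OpStats {
  int64_t ram_anchored = 0;
};

// Report the DMS memory of the rowsets a compaction would rewrite as ram
// anchored, so the maintenance manager sees the compaction as a way to
// flush many small DMSs at once.
void UpdateCompactionStats(const std::vector<RowSetInfo>& selected_rowsets,
                           OpStats* stats) {
  int64_t total_dms_bytes = 0;
  for (const RowSetInfo& rs : selected_rowsets) {
    total_dms_bytes += rs.dms_bytes;
  }
  stats->ram_anchored = total_dms_bytes;
}

This is only the bookkeeping half: the selection logic described in the
issue, which prefers the op that frees the most memory when under pressure,
would then let such compactions compete with the MRS and DMS flushes.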