[ 
https://issues.apache.org/jira/browse/KUDU-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845995#comment-17845995
 ] 

ASF subversion and git services commented on KUDU-3568:
-------------------------------------------------------

Commit b607633fd3c2b676fbd2cbe57c44bddf818dc457 in kudu's branch 
refs/heads/master from Ashwani Raina
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=b607633fd ]

KUDU-3568 Fix compaction budgeting test by setting memory hard limit

TestRowSetCompactionSkipWithBudgetingConstraints can fail if the node
running the test has a large amount of memory. This happens because the
test generates deltas a few MB in size, and that size is multiplied by
a preset factor so that the result (i.e. the memory required to
complete rowset compaction) is large, on the order of 200 GB per
rowset.

Even though nodes running the test generally don't have that much
physical memory, the test can still end up running on a high-memory
node. On such nodes, the test might fail.

The patch fixes the problem by deterministically ensuring that the
compaction memory requirement is always higher than the memory hard
limit. It does so as follows:
1. Move the budgeting compaction tests out into a separate binary.
2. This gives the flexibility to set the memory hard limit as the
   tests need. It is important to note that once a memory hard limit
   is set, it remains the same for all tests executed during the
   lifetime of the binary.
3. Set the memory hard limit to 1 GB, which is enough to handle the
   compaction requirements of
   TestRowSetCompactionProceedWithNoBudgetingConstraints.
   For TestRowSetCompactionSkipWithBudgetingConstraints, it is not
   enough because the delta memory factor is set high enough for the
   requirement to exceed 1 GB.
   Both tests are now expected to succeed deterministically.
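
The budgeting decision described above can be sketched in a few lines of
C++. This is only an illustration of the arithmetic, not Kudu's code: every
name here (EstimateCompactionMemory, ShouldSkipRowSet, the constants) is
hypothetical, the 2 MiB delta size and the factor of 102400 are made-up
values chosen to land near the "order of 200 GB" figure, and the 80% rule
for the large node is an assumption used only to show why a limit tied to
physical memory makes the outcome machine-dependent.

```cpp
#include <cstdint>

// NOTE: all names and numbers below are hypothetical and only illustrate
// the comparison described in the commit message; they are not Kudu APIs.
constexpr int64_t kKiB = 1024;
constexpr int64_t kMiB = 1024 * kKiB;
constexpr int64_t kGiB = 1024 * kMiB;

// Estimated memory needed to compact one rowset: the size of its deltas
// scaled by a preset factor.
constexpr int64_t EstimateCompactionMemory(int64_t delta_bytes, int64_t factor) {
  return delta_bytes * factor;
}

// A rowset is dropped from the compaction input when its estimated memory
// requirement exceeds the memory hard limit.
constexpr bool ShouldSkipRowSet(int64_t estimated_bytes, int64_t hard_limit_bytes) {
  return estimated_bytes > hard_limit_bytes;
}

// Illustrative inputs: 2 MiB of deltas and a factor of 102400 (200 GiB total).
constexpr int64_t kInflatedEstimate = EstimateCompactionMemory(2 * kMiB, 102400);

// Assumption: a limit derived as 80% of the MemTotal value reported in the
// issue (527417196 kB), standing in for a limit tied to physical memory.
constexpr int64_t kLargeNodeLimit = 527417196 * kKiB / 10 * 8;

// The limit the patch pins for the test binary.
constexpr int64_t kPinnedLimit = 1 * kGiB;

// On the large node the inflated estimate still fits, so nothing is skipped
// and the "skip" test does not see the message it expects.
static_assert(!ShouldSkipRowSet(kInflatedEstimate, kLargeNodeLimit),
              "inflated estimate fits under the large-node limit");

// With the limit pinned to 1 GiB the same estimate always exceeds it.
static_assert(ShouldSkipRowSet(kInflatedEstimate, kPinnedLimit),
              "inflated estimate exceeds the pinned limit");

// An uninflated estimate (factor 1) stays under 1 GiB, so the "proceed"
// test still compacts.
static_assert(!ShouldSkipRowSet(EstimateCompactionMemory(2 * kMiB, 1), kPinnedLimit),
              "small estimate fits under the pinned limit");
```

Under these assumptions, pinning the limit removes the only input to the
comparison that varied from machine to machine, which is why both tests
become deterministic.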

Change-Id: I85d104e1d066507ce8e72a00cc5165cc4b85e48d
Reviewed-on: http://gerrit.cloudera.org:8080/21416
Tested-by: Alexey Serbin <ale...@apache.org>
Reviewed-by: Alexey Serbin <ale...@apache.org>


> TestRowSetCompactionSkipWithBudgetingConstraints fails when run on some nodes
> -----------------------------------------------------------------------------
>
>                 Key: KUDU-3568
>                 URL: https://issues.apache.org/jira/browse/KUDU-3568
>             Project: Kudu
>          Issue Type: Bug
>    Affects Versions: 1.18.0
>            Reporter: Alexey Serbin
>            Assignee: Ashwani Raina
>            Priority: Major
>         Attachments: test-failure.log.xz
>
>
> The {{TestCompaction.TestRowSetCompactionSkipWithBudgetingConstraints}} 
> scenario fails with an error like the one below when run on a machine with 
> a relatively large amount of memory (it might be just a Docker instance with 
> little actual memory allocated, but with access to the {{/proc}} filesystem 
> of the host machine).  The full test log is attached.
> {noformat}
> src/kudu/tablet/compaction-test.cc:908: Failure
> Value of: JoinStrings(sink.logged_msgs(), "\n")
> Expected: has substring "removed from compaction input due to memory 
> constraints"
>   Actual: "I20240425 10:13:05.497732 3573764 compaction-test.cc:902] 
> CompactRowSetsOp complete. Timing: real 0.673s\tuser 0.669s\tsys 0.004s 
> Metrics: 
> {\"bytes_written\":4817,\"cfile_cache_hit\":90,\"cfile_cache_hit_bytes\":4310,\"cfile_cache_miss\":330,\"cfile_cache_miss_bytes\":3794180,\"cfile_init\":41,\"delta_iterators_relevant\":40,\"dirs.queue_time_us\":503,\"dirs.run_cpu_time_us\":338,\"dirs.run_wall_time_us\":1780,\"drs_written\":1,\"lbm_read_time_us\":1951,\"lbm_reads_lt_1ms\":494,\"lbm_write_time_us\":1767,\"lbm_writes_lt_1ms\":132,\"mutex_wait_us\":189,\"num_input_rowsets\":10,\"peak_mem_usage\":2147727,\"rows_written\":20,\"thread_start_us\":242,\"threads_started\":5}"
>  (of type std::string)
> {noformat}
> For extra information, below are the first 10 lines of the {{/proc/meminfo}} 
> file on a node where the test failed:
> {noformat}
> # cat /proc/meminfo  | head -10
> MemTotal:       527417196 kB
> MemFree:        96640684 kB
> MemAvailable:   363590980 kB
> Buffers:        15352304 kB
> Cached:         246687576 kB
> SwapCached:      1294016 kB
> Active:         214889608 kB
> Inactive:       189745504 kB
> Active(anon):   133110648 kB
> Inactive(anon): 16977280 kB
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
