[ https://issues.apache.org/jira/browse/IMPALA-14276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18012736#comment-18012736 ]
ASF subversion and git services commented on IMPALA-14276:
----------------------------------------------------------
Commit 859c9c1f6666e3a62d827661a03d65700d11fc48 in impala's branch
refs/heads/master from Yida Wu
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=859c9c1f6 ]
IMPALA-14276: Fix memory leak by removing AdmissionState on rejection
Normally, AdmissionState entries in admissiond are cleaned up when
a query is released. However, for rejected requests the query
release path is never called, so their AdmissionState was not
removed from admission_state_map_, resulting in a memory leak over
time.
This leak was less noticeable because AdmissionState entries were
relatively small. However, when admissiond is run as a standalone
process, each AdmissionState includes a profile sidecar, which
can be large, making the leak much more significant.
This change adds logic to remove AdmissionState entries when the
admission request is rejected.
Testing:
Added test_admission_state_map_mem_leak as a regression test.
Change-Id: I9fba4f176c648ed7811225f7f94c91342a724d10
Reviewed-on: http://gerrit.cloudera.org:8080/23257
Reviewed-by: Riza Suminto
Reviewed-by: Abhishek Rawat
Tested-by: Impala Public Jenkins
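The lifecycle described in the commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Impala's implementation: the class ToyAdmissionService, its methods, the remove_on_rejection switch, and the helper leaked_entries are all hypothetical names introduced for illustration. Only AdmissionState, the state map, and the profile sidecar come from the commit message.

```python
# Toy model only: names mirror the commit message, but this is not Impala code.
from dataclasses import dataclass, field


@dataclass
class AdmissionState:
    query_id: str
    # Per the commit message, the sidecar can be large in standalone admissiond.
    profile_sidecar: bytes = b""


@dataclass
class ToyAdmissionService:
    max_running: int = 1
    remove_on_rejection: bool = True  # False reproduces the leaky behaviour
    running: set = field(default_factory=set)
    admission_state_map: dict = field(default_factory=dict)

    def admit(self, query_id, sidecar=b""):
        """Register state for the request, then admit or reject it."""
        self.admission_state_map[query_id] = AdmissionState(query_id, sidecar)
        if len(self.running) < self.max_running:
            self.running.add(query_id)
            return True
        # Rejected: release() is never called for this query, so the entry
        # must be removed here or it stays in the map forever.
        if self.remove_on_rejection:
            del self.admission_state_map[query_id]
        return False

    def release(self, query_id):
        """Normal cleanup path, reached only by admitted queries."""
        self.running.discard(query_id)
        self.admission_state_map.pop(query_id, None)


def leaked_entries(remove_on_rejection, rejected=500):
    """Count map entries left behind after `rejected` rejected requests."""
    svc = ToyAdmissionService(remove_on_rejection=remove_on_rejection)
    assert svc.admit("long-running")  # holds the only slot
    for i in range(rejected):
        assert not svc.admit(f"q{i}", sidecar=b"x" * 1024)
    svc.release("long-running")
    return len(svc.admission_state_map)
```

With remove_on_rejection=False every rejected request leaves an entry behind, so the map grows with the number of rejections. With it set to True the map is empty once the admitted query is released.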
> Suspected memory leak when requests are rejected in standalone admissiond
> -------------------------------------------------------------------------
>
> Key: IMPALA-14276
> URL: https://issues.apache.org/jira/browse/IMPALA-14276
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Yida Wu
> Assignee: Yida Wu
> Priority: Major
>
> During testing, I observed that memory usage (tcmalloc.bytes-in-use) keeps
> increasing when requests are rejected or time out via the admissiond service.
> This behavior seems to occur only when admissiond is running as a standalone
> process. I did not observe the same pattern when admissiond is embedded
> within the impalad process.
> Repro:
> 1. Start the cluster with standalone admissiond:
> {code:java}
> $IMPALA_HOME/bin/start-impala-cluster.py \
>   --admissiond_args='--max_admission_queue_size=1000000 --default_pool_max_queued=100000 --default_pool_max_requests=1' \
>   --impalad_args='--fe_service_threads=1024' --num_coordinators=2 \
>   --enable_admission_service
> {code}
> 2. Run one long-running query to hold the slot:
> {code:java}
> select sleep(150000000);
> {code}
> 3. Run the following script to keep submitting queries, which will time out:
> {code:java}
> #!/bin/bash
> MAX_RUNS=10
> for ((j = 1; j <= MAX_RUNS; j++)); do
>   # Launch 500 background shells per run, one every 0.2 seconds.
>   for i in {1..500}; do
>     nohup "$IMPALA_HOME/bin/impala-shell.sh" -f test_sleep.sql > "out.log" 2>&1 &
>     sleep 0.2
>   done
>   disown -a
>   sleep 60
> done
> {code}
> 4. Monitor memory usage. It increases steadily over time, as shown below.
> Given that all incoming requests are being rejected, we wouldn’t expect
> memory to grow like this, which suggests a possible memory leak.
> {code:java}
> tcmalloc.bytes-in-use 115.51 MB -> 142.04 MB -> 168.49 MB -> 225.71 MB
> {code}
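As a rough check on the figures reported above, the growth between consecutive samples can be computed directly. This is a minimal sketch: the variable names are illustrative, and because the report does not state the sampling interval, no growth rate per unit of time is derived.

```python
# tcmalloc.bytes-in-use samples from the report, in MB.
samples_mb = [115.51, 142.04, 168.49, 225.71]

# Growth between each pair of consecutive samples, rounded to 2 decimals.
deltas_mb = [round(b - a, 2) for a, b in zip(samples_mb, samples_mb[1:])]

# Overall growth from the first sample to the last.
total_growth_mb = round(samples_mb[-1] - samples_mb[0], 2)

# True when memory never drops between samples.
strictly_increasing = all(d > 0 for d in deltas_mb)

print(deltas_mb, total_growth_mb, strictly_increasing)
```

Every delta is positive, which matches the report's observation that memory never returns to its baseline while requests are being rejected.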