We haven't experienced anything like that up to 2.8. We are still in the process of stabilizing 2.10 as we upgrade some of the bigger clusters. We will know soon how 2.10 datanodes behave under heavy load and storage utilization.
If you are seeing a significant change, it might be something post-2.8 or even post-2.10.

Kihwal

On Tue, Oct 6, 2020 at 5:09 PM Wei-Chiu Chuang <weic...@cloudera.com> wrote:

> Sorry for not being specific.
> I was referring to HDFS-8791
> <https://issues.apache.org/jira/browse/HDFS-8791>
> (block ID-based DN storage layout can be very slow for datanode on ext4)
> where it is in 2.8 and above.
>
> As I understand it, the increased heap usage only occurs during upgrade.
> No issue afterwards.
>
> My experience was based on CDH5 to CDH6 upgrade (Hadoop 2.6 -> Hadoop 3.0)
> and HDP2 to HDP3 (Hadoop 2.7 -> Hadoop 3.1) upgrade. It is nearly
> impossible to tell which commit increases heap usage worse during upgrade.
>
> On Tue, Oct 6, 2020 at 3:01 PM Kihwal Lee <kih...@verizonmedia.com> wrote:
>
>> Which layout change are you referring to? The only layout change I know
>> of was done in 2.7, IIRC. We backported that to 2.6 and did not see any
>> adverse effects at that time.
>>
>> Is datanode using more heap all the time? Or is it running into trouble
>> when generating full block reports?
>>
>> Kihwal
>>
>> On Mon, Oct 5, 2020 at 1:40 PM Wei-Chiu Chuang
>> <weic...@cloudera.com.invalid> wrote:
>>
>>> We experienced this issue on CDH6 and HDP3, so roughly Hadoop 3.0.x and
>>> 3.1.x.
>>> Hermanth experienced the same issue on Hadoop 3.1.1 as well (HDFS-15569
>>> <https://issues.apache.org/jira/browse/HDFS-15569>)
>>>
>>> On Mon, Oct 5, 2020 at 11:03 AM Igor Dvorzhak <i...@google.com> wrote:
>>>
>>> > What Hadoop 3 version do you use?
>>> >
>>> > On Mon, Oct 5, 2020 at 10:03 AM Wei-Chiu Chuang <weic...@apache.org>
>>> > wrote:
>>> >
>>> >> I have anecdotally learned of multiple data points where during the
>>> >> upgrading from Hadoop 2 to Hadoop 3, DN heap usage increases to the
>>> >> point where it goes OOM.
>>> >>
>>> >> Don't have much logs for this issue, but I suspect it's caused by the
>>> >> layout change added in Hadoop 2.8.0.
>>> >>
>>> >> Does anyone else observe the same issue and how do you mitigate this?
>>> >> For now we suggested increasing DN heap size prior to upgrade as part of
>>> >> pre-upgrade checklist.
>>> >>
>>> >> Thanks,
>>> >> Wei-Chiu