Sahil Takiar created HDFS-14304:
-----------------------------------

             Summary: High lock contention on hdfsHashMutex in libhdfs
                 Key: HDFS-14304
                 URL: https://issues.apache.org/jira/browse/HDFS-14304
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: hdfs-client, libhdfs, native
            Reporter: Sahil Takiar
            Assignee: Sahil Takiar

While profiling an application that uses libhdfs, we noticed heavy lock contention on the {{hdfsHashMutex}} defined in {{hadoop-hdfs-native-client/src/main/native/libhdfs/os/mutexes.h}}.

The issue is that every JNI method invocation made by {{hdfs.c}} goes through a helper method called {{invokeMethod}}. {{invokeMethod}} calls {{globalClassReference}}, which acquires {{hdfsHashMutex}} while performing a lookup in a {{htable}} (a custom hash table that lives in {{libhdfs/common}}). The lock is acquired for both reads and writes. The hash table maps {{char *className}} to {{jclass}} objects; the goal of the hash table seems to be to avoid repeatedly creating {{jclass}} objects for each JNI call.

For multi-threaded applications, this lock severely limits the rate at which Java methods can be invoked. pstacks show a lot of time being spent on {{hdfsHashMutex}}:

{code:java}
#0  0x00007fba2dbc242d in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fba2dbbddcb in _L_lock_812 () from /lib64/libpthread.so.0
#2  0x00007fba2dbbdc98 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00000000027d8386 in mutexLock ()
#4  0x00000000027d0e7b in globalClassReference ()
#5  0x00000000027d1160 in invokeMethod ()
#6  0x00000000027d4176 in readDirect ()
#7  0x00000000027d4325 in hdfsRead ()
{code}

The same shows up in {{perf report}}:

{code:java}
+   63.36%     0.01%  [k] system_call_fastpath
+   61.60%     0.12%  [k] sys_futex
+   61.45%     0.13%  [k] do_futex
+   57.54%     0.49%  [k] _raw_qspin_lock
+   57.07%     0.01%  [k] queued_spin_lock_slowpath
+   55.47%    55.47%  [k] native_queued_spin_lock_slowpath
-   35.68%     0.00%  [k] 0x6f6f6461682f6568
   - 0x6f6f6461682f6568
      - 30.55% __lll_lock_wait
         - 29.40% system_call_fastpath
            - 29.39% sys_futex
               - 29.35% do_futex
                  - 29.27% futex_wait
                     - 28.17% futex_wait_setup
                        - 27.05% _raw_qspin_lock
                           - 27.05% queued_spin_lock_slowpath
                                26.30% native_queued_spin_lock_slowpath
                              + 0.67% ret_from_intr
                     + 0.71% futex_wait_queue_me
      - 2.00% methodIdFromClass
         - 1.94% jni_GetMethodID
            - 1.71% get_method_id
                 0.96% SymbolTable::lookup_only
      - 1.61% invokeMethod
         - 0.62% jni_CallLongMethodV
              0.52% jni_invoke_nonstatic
        0.75% pthread_mutex_lock
{code}
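
One possible direction, sketched here purely as an illustration and not as the change proposed or adopted for this issue: if the set of classes the client needs is known up front, the references can be resolved once and then read without taking a mutex on every call. The sketch below uses plain pthreads and no JNI so that it stands alone. {{class_ref}}, {{enum cached_class}}, {{resolve_class}} and {{get_class_ref}} are hypothetical names, the class list is only an example, and {{resolve_class}} is a stand-in for whatever the real lookup does.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a jclass global reference. */
typedef const void *class_ref;

/* Hypothetical, fixed set of classes known at compile time. */
enum cached_class {
    CLASS_FS_DATA_INPUT_STREAM,
    CLASS_FILE_SYSTEM,
    NUM_CACHED_CLASSES
};

/* Example class names only; the real list is an assumption. */
static const char *const class_names[NUM_CACHED_CLASSES] = {
    "org/apache/hadoop/fs/FSDataInputStream",
    "org/apache/hadoop/fs/FileSystem",
};

static class_ref class_refs[NUM_CACHED_CLASSES];
static pthread_once_t class_refs_once = PTHREAD_ONCE_INIT;

/* Counts how often the expensive lookup runs (written only during init). */
static int resolve_calls;

/* Stand-in for the expensive lookup; a JNI version would resolve the
 * class and keep a global reference. Here the name pointer is the handle. */
static class_ref resolve_class(const char *name)
{
    resolve_calls++;
    return name;
}

/* Runs exactly once, no matter how many threads race to get here. */
static void init_class_refs(void)
{
    for (int i = 0; i < NUM_CACHED_CLASSES; i++) {
        class_refs[i] = resolve_class(class_names[i]);
    }
}

/* Read path: after the first call, pthread_once returns without running
 * init_class_refs again, and the lookup is a plain array index. */
class_ref get_class_ref(enum cached_class id)
{
    assert((int)id >= 0 && (int)id < NUM_CACHED_CLASSES);
    pthread_once(&class_refs_once, init_class_refs);
    return class_refs[id];
}
```

In this sketch a caller in the position of {{invokeMethod}} would pass an enum value to {{get_class_ref}} instead of a class-name string, so the per-call hash lookup under {{hdfsHashMutex}} disappears from the hot path. The trade-off is that the class list has to be fixed ahead of time, which may or may not fit how libhdfs is used.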