[ https://issues.apache.org/jira/browse/KAFKA-9729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manikumar resolved KAFKA-9729.
------------------------------
    Resolution: Fixed

> Shrink inWriteLock time in SimpleAuthorizer
> -------------------------------------------
>
>                 Key: KAFKA-9729
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9729
>             Project: Kafka
>          Issue Type: Improvement
>          Components: security
>    Affects Versions: 1.1.0
>            Reporter: Jiao Zhang
>            Assignee: Lucas Bradstreet
>            Priority: Minor
>
> The current SimpleAuthorizer takes 'inWriteLock' when processing add/remove ACL
> requests, while getAcls in authorize() takes 'inReadLock'.
> That means handling an add/remove ACL request blocks all other requests,
> for example produce and fetch requests.
> When processing add/remove ACL requests, updateResourceAcls() accesses ZooKeeper
> to update the ACLs, which can take a long time in cases such as a network glitch.
> We simulated ZooKeeper delay.
> When adding a 100ms delay on the ZooKeeper side, 'inWriteLock' in
> addAcls()/removeAcls() lasts for 400ms~500ms.
> When adding a 500ms delay on the ZooKeeper side, 'inWriteLock' in
> addAcls()/removeAcls() lasts for 2000ms~2500ms.
> {code:java}
> override def addAcls(acls: Set[Acl], resource: Resource) {
>   if (acls != null && acls.nonEmpty) {
>     inWriteLock(lock) {
>       // Measure how long the write lock is held.
>       val startMs = Time.SYSTEM.milliseconds()
>       updateResourceAcls(resource) { currentAcls =>
>         currentAcls ++ acls
>       }
>       warn(s"inWriteLock in addAcls consumes ${Time.SYSTEM.milliseconds() - startMs} milliseconds.")
>     }
>   }
> }{code}
> Blocking produce/fetch requests for 2s would cause noticeable performance
> degradation for the whole cluster.
> So we are wondering whether it is possible to remove 'inWriteLock' from
> addAcls/removeAcls and take 'inWriteLock' only inside updateCache, which is
> called by addAcls/removeAcls.
> {code:java}
> private def updateCache(resource: Resource, versionedAcls: VersionedAcls) {
>   if (versionedAcls.acls.nonEmpty) {
>     aclCache.put(resource, versionedAcls)
>   } else {
>     aclCache.remove(resource)
>   }
> }
> {code}
> If we do this, the blocking time is only the time needed to update the local
> cache, which is not affected by network glitches. However, we don't know
> whether there were specific reasons for the current strict write lock, and we
> are not sure whether there are side effects if the lock is only taken in
> updateCache.
> Btw, the latest version uses 'inWriteLock' in the same places as version 1.1.0.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
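
For illustration only, the narrowing proposed in the issue could be sketched roughly as below. This is not the change that was merged for KAFKA-9729. It is a minimal, self-contained sketch under these assumptions: ACLs and resources are plain strings, ZooKeeper is replaced by an in-memory `AtomicReference` with an artificial `delayMs`, and the names `NarrowLockSketch`, `updateStore` and `delayMs` are hypothetical. The slow read-modify-write runs without any lock and relies on a compare-and-set retry, and only the cache update takes the write lock. Because two writers can then reach the cache out of order, the sketch keeps every entry (including empty ones) and skips results whose version is older, which differs from the `updateCache` quoted above.

```scala
import java.util.concurrent.atomic.AtomicReference
import java.util.concurrent.locks.ReentrantReadWriteLock
import scala.collection.mutable

// Hypothetical stand-in for Kafka's VersionedAcls; ACLs are plain strings here.
final case class VersionedAcls(acls: Set[String], version: Int)

object NarrowLockSketch {
  private val lock = new ReentrantReadWriteLock()
  private val aclCache = mutable.HashMap.empty[String, VersionedAcls]

  // Simulated remote store (ZooKeeper in the issue); every attempt pays delayMs.
  private val store = new AtomicReference(Map.empty[String, VersionedAcls])
  @volatile var delayMs: Long = 0L

  private def inReadLock[T](body: => T): T = {
    lock.readLock().lock()
    try body finally lock.readLock().unlock()
  }

  private def inWriteLock[T](body: => T): T = {
    lock.writeLock().lock()
    try body finally lock.writeLock().unlock()
  }

  // Slow part: read-modify-write against the store, retried on conflict.
  // No lock is held while this runs.
  private def updateStore(resource: String)(f: Set[String] => Set[String]): VersionedAcls = {
    var result: VersionedAcls = null
    while (result == null) {
      Thread.sleep(delayMs)
      val snapshot = store.get()
      val current = snapshot.getOrElse(resource, VersionedAcls(Set.empty, 0))
      val updated = VersionedAcls(f(current.acls), current.version + 1)
      if (store.compareAndSet(snapshot, snapshot.updated(resource, updated))) result = updated
    }
    result
  }

  // Fast part: the only section that blocks readers.
  private def updateCache(resource: String, versioned: VersionedAcls): Unit = inWriteLock {
    // Skip stale results, since two writers may arrive here out of order.
    if (aclCache.get(resource).forall(_.version < versioned.version))
      aclCache.update(resource, versioned)
  }

  def addAcls(acls: Set[String], resource: String): Unit =
    if (acls != null && acls.nonEmpty)
      updateCache(resource, updateStore(resource)(_ ++ acls))

  def removeAcls(acls: Set[String], resource: String): Unit =
    if (acls != null && acls.nonEmpty)
      updateCache(resource, updateStore(resource)(_ -- acls))

  // Read path, as used by authorize(): only waits for in-progress cache updates.
  def getAcls(resource: String): Set[String] = inReadLock {
    aclCache.get(resource).map(_.acls).getOrElse(Set.empty)
  }
}
```

With `delayMs` set to a few hundred milliseconds, a call to `getAcls` made while another thread is inside `addAcls` should return without waiting for the simulated store, because the write lock is only held for the in-memory update. Whether the real SimpleAuthorizer can rely on version checks in this way is exactly the open question raised in the issue.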