[ https://issues.apache.org/jira/browse/KUDU-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282210#comment-17282210 ]
ASF subversion and git services commented on KUDU-3212:
-------------------------------------------------------

Commit 32c5b9c60bc923e27f0a4c78508fc3e2d1276e28 in kudu's branch refs/heads/master from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=32c5b9c ]

[master] turn off client location assignment by default

This patch turns off location assignment by default for clients
connecting to a Kudu cluster. The assigned locations are cached, but
the way location assignment is performed is resource-consuming; see [1]
for details.

There aren't many benefits in assigning locations to clients so far.
The only advantage of a client with an assigned location over a client
without one is that, in the former case, a client running in a
particular location chooses tablet servers in the same location when
performing a scan in ReplicaSelection::CLOSEST_REPLICA mode.

This is a first step towards addressing KUDU-3212.

[1] https://issues.apache.org/jira/browse/KUDU-3212

Change-Id: I78474ced0a0129b3f2b1add55f6f908a136106d0
Reviewed-on: http://gerrit.cloudera.org:8080/17024
Tested-by: Kudu Jenkins
Reviewed-by: Hao Hao <hao....@cloudera.com>
Reviewed-by: Grant Henke <granthe...@apache.org>


> Location assignment improvements
> --------------------------------
>
>                 Key: KUDU-3212
>                 URL: https://issues.apache.org/jira/browse/KUDU-3212
>             Project: Kudu
>          Issue Type: Improvement
>          Components: client, master, tserver
>   Affects Versions: 1.10.1, 1.12.0, 1.11.1, 1.13.0
>            Reporter: Alexey Serbin
>            Priority: Major
>              Labels: performance, scalability
>
> The current implementation of location assignment has some room for
> improvement. As of now, the following is understood:
> # Implementation-wise, Kudu masters could use the newly introduced
> [Subprocess|https://github.com/apache/kudu/tree/master/src/kudu/subprocess]
> functionality to run the location assignment script. That would be more
> robust than the current fork/exec approach, especially for larger
> deployments where Kudu masters might handle a high request-per-second
> rate (many active threads running, a lot of memory allocated, etc.).
> # Conceptually, Kudu tablet servers could have all the necessary
> information about their location at startup, and that information isn't
> going to change while the tablet server is running. The server/machine
> they are running on is provisioned to be in some rack, availability zone,
> data center, etc., and that assignment doesn't change while the server is
> up and running. So, a Kudu tablet server can be provided with information
> about its location upon startup; there is no need to consult the Kudu
> master about this.
> # Conceptually, Kudu clients might be aware of their location as well.
>
> To address item 1, it's necessary to update the current implementation of
> location assignment so that the script is run by a dedicated subprocess
> forked off early during the master's startup. Ideally, to make it more
> robust, the subprocess server can run the location assignment script as a
> small server that takes an IP address or DNS name as input and provides a
> location label as output, maybe line-by-line. The latter assumes changing
> the requirements for a location assignment script, and we should probably
> introduce a separate flag to specify the path to a script that runs in
> such a mode. However, even with the current location assignment approach,
> where it's necessary to run a script for every location assignment
> request, using the {{Subprocess}} functionality would benefit larger
> deployments where the fork/exec sequence for a {{kudu-master}} process is
> slow and inefficient.
>
> To address item 2, it's necessary to introduce a new tablet server flag
> that is set to the assigned location for the tablet server. The
> systemd/init.d startup script for kudu-tserver should populate the flag
> with the proper value. It's also necessary to introduce a new field in the
> {{TSHeartbeatRequestPB}} message to pass the location from the tablet
> server to the master. If the master sees the field populated, it should
> not run the location assignment script, even if the location assignment
> script is specified (i.e. the {{\-\-location_mapping_cmd}} flag is set).
> This way it would be possible to perform rolling upgrades from older
> versions, which use a centrally managed location assignment script, to the
> version that implements the new approach.
>
> To address item 3, it's necessary to find a means to specify the location
> for a Kudu client. Probably, an environment variable can be used for that.
> The {{ConnectToMasterRequestPB}} can be extended to include an optional
> {{client_location}} field. In addition, if
> {{\-\-master_client_location_assignment_enabled}} is set to {{true}}, the
> master could run the location assignment script to assign a location to a
> client which doesn't populate the newly introduced
> {{ConnectToMasterRequestPB::client_location}} field.
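
As an illustration of the line-by-line mode suggested for item 1 in the issue description, here is a minimal sketch of what such a long-running location assignment script might look like. It is not part of Kudu: the subnet table, the default label, and the function names (resolve_location, serve) are all hypothetical, and the sketch does not resolve DNS names. It only shows the proposed contract of one IP address or host name per input line and one location label per output line.

```python
import ipaddress

# Hypothetical example mapping from subnets to location labels.
# A real deployment would load this from its own inventory.
SUBNET_TO_LOCATION = {
    "10.1.0.0/16": "/dc0/rack0",
    "10.2.0.0/16": "/dc0/rack1",
}
# Assumed fallback label for hosts that match no subnet.
DEFAULT_LOCATION = "/unknown"

_NETWORKS = [
    (ipaddress.ip_network(subnet), location)
    for subnet, location in SUBNET_TO_LOCATION.items()
]


def resolve_location(host: str) -> str:
    """Return the location label for an IP address given as text."""
    try:
        addr = ipaddress.ip_address(host.strip())
    except ValueError:
        # Not an IP literal (e.g. a DNS name); this sketch does not resolve it.
        return DEFAULT_LOCATION
    for network, location in _NETWORKS:
        if addr in network:
            return location
    return DEFAULT_LOCATION


def serve(inp, out) -> None:
    """Answer one request per input line until the input is closed."""
    for line in inp:
        if not line.strip():
            continue  # skip blank lines instead of answering them
        out.write(resolve_location(line) + "\n")
        out.flush()  # the caller waits for each answer, so do not buffer
```

Calling serve(sys.stdin, sys.stdout) would turn this into the kind of long-lived script a subprocess could keep running, so that each assignment request costs one line written and one line read rather than a fork/exec of the master process.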