[ https://issues.apache.org/jira/browse/KUDU-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexey Serbin updated KUDU-3212:
--------------------------------
    Affects Version/s: 1.10.1
                       1.12.0
                       1.11.1

> Location assignment improvements
> --------------------------------
>
>                 Key: KUDU-3212
>                 URL: https://issues.apache.org/jira/browse/KUDU-3212
>             Project: Kudu
>          Issue Type: Improvement
>          Components: client, master, tserver
>    Affects Versions: 1.10.1, 1.12.0, 1.11.1, 1.13.0
>            Reporter: Alexey Serbin
>            Priority: Major
>              Labels: performance, scalability
>
> The current implementation of location assignment has some room for
> improvement. As of now, the following is understood:
> # Implementation-wise, Kudu masters could use the newly introduced
> [Subprocess|https://github.com/apache/kudu/tree/master/src/kudu/subprocess]
> functionality to run the location assignment script. That would be more
> robust than the current fork/exec approach to running the script, especially
> for larger deployments where Kudu masters might have a high
> request-per-second rate (many active threads running, a lot of memory
> allocated, etc.)
> # Conceptually, a Kudu tablet server could have all the necessary
> information regarding its location at startup, and that information isn't
> going to change while the tablet server is running. The machine it runs on
> is provisioned in some rack, availability zone, data center, etc., and that
> assignment doesn't change while the server is up and running. So, a Kudu
> tablet server can be provided with information about its location upon
> startup; there is no need to consult the Kudu master about this.
> # Conceptually, Kudu clients might be aware of their location as well.
>
> To address item 1, it's necessary to update the current implementation of
> location assignment so that the script is run by a dedicated subprocess
> forked off early during the master's startup. Ideally, to make it more
> robust, the subprocess server can run the location assignment script as a
> small server that takes an IP address or DNS name on input and provides a
> location label on output, maybe line-by-line. The latter assumes changing
> the requirements for a location assignment script, and probably we should
> introduce a separate flag to specify the path to a script that runs in such
> a mode. However, even with the current location assignment approach, where
> it's necessary to run a script for every location assignment request, using
> the {{Subprocess}} functionality would benefit larger deployments where the
> fork/exec sequence for a {{kudu-master}} process is slow and inefficient.
>
> To address item 2, it's necessary to introduce a new tablet server flag that
> is set to the assigned location for the tablet server. The systemd/init.d
> startup script for kudu-tserver should populate the flag with the proper
> value. It's also necessary to introduce a new field in the
> {{TSHeartbeatRequestPB}} message to pass the location from the tablet server
> to the master. If the master sees the field populated, it should not run the
> location assignment script, even if the location assignment script is
> specified (i.e. the {{\-\-location_mapping_cmd}} flag is set). This way it
> would be possible to perform rolling upgrades from older versions, which use
> a centrally managed location assignment script, to the version that
> implements the new approach.
>
> To address item 3, it's necessary to find a means to specify the location
> for a Kudu client. Probably, an environment variable can be used for that.
> The {{ConnectToMasterRequestPB}} message can be extended to include an
> optional {{client_location}} field. In addition, if
> {{\-\-master_client_location_assignment_enabled}} is set to {{true}}, the
> master could run the location assignment script to assign a location to a
> client which doesn't populate the newly introduced
> {{ConnectToMasterRequestPB::client_location}} field.
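
To illustrate the line-by-line mode proposed for item 1, here is a minimal sketch of what such a long-running location assignment script might look like. It is not part of Kudu: the subnet table, the location labels, the default label, and the function names are all hypothetical, and DNS names are not resolved in this sketch (they simply fall back to the default label).

```python
import ipaddress
import sys

# Hypothetical subnet-to-location table; a real deployment would derive this
# from its own inventory or provisioning system.
SUBNET_LOCATIONS = [
    (ipaddress.ip_network("10.1.0.0/16"), "/dc0/rack0"),
    (ipaddress.ip_network("10.2.0.0/16"), "/dc0/rack1"),
]
# Hypothetical label returned when nothing matches.
DEFAULT_LOCATION = "/default"


def lookup_location(host: str) -> str:
    """Map an IP address to a location label; anything else gets the default."""
    try:
        addr = ipaddress.ip_address(host.strip())
    except ValueError:
        # Not an IP literal (e.g. a DNS name): this sketch does not resolve it.
        return DEFAULT_LOCATION
    for network, location in SUBNET_LOCATIONS:
        if addr in network:
            return location
    return DEFAULT_LOCATION


def serve(requests, responses) -> None:
    """Answer one location label per non-empty input line, flushing each reply."""
    for line in requests:
        host = line.strip()
        if not host:
            continue
        responses.write(lookup_location(host) + "\n")
        # Flush so the parent process sees each answer immediately.
        responses.flush()


def main() -> None:
    # Entry point for running as a long-lived child process.
    serve(sys.stdin, sys.stdout)
```

A wrapper would call `main()` once at startup and keep the process alive, so the cost of starting the script is paid once rather than per request.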
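
The precedence rules described for items 2 and 3 can be modeled roughly as follows. This is only a Python sketch of the proposed decision logic, not Kudu master code: the function and parameter names are hypothetical, `run_script` stands in for whatever mechanism runs the location assignment script, and requiring both the mapping command and the client-assignment flag for clients is an assumption about how the two flags would combine.

```python
from typing import Callable, Optional


def resolve_tserver_location(
    reported_location: Optional[str],
    mapping_cmd_set: bool,
    run_script: Callable[[str], str],
    host: str,
) -> Optional[str]:
    """Location for a tablet server, preferring what it reports in its heartbeat."""
    if reported_location:
        # Field populated by the tablet server: do not run the script,
        # even if --location_mapping_cmd is set.
        return reported_location
    if mapping_cmd_set:
        # Older tablet server during a rolling upgrade: fall back to the script.
        return run_script(host)
    return None


def resolve_client_location(
    client_location: Optional[str],
    mapping_cmd_set: bool,
    client_assignment_enabled: bool,
    run_script: Callable[[str], str],
    host: str,
) -> Optional[str]:
    """Location for a client, preferring the client-supplied value."""
    if client_location:
        return client_location
    if mapping_cmd_set and client_assignment_enabled:
        # Client did not populate client_location: let the master assign one.
        return run_script(host)
    return None
```

In both cases a self-reported location wins, which is what makes a mixed-version cluster workable during a rolling upgrade.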
--
This message was sent by Atlassian Jira
(v8.3.4#803005)