> [ ... ] adding the new machines to the CellServDB before the
> new server is up. You could bring up e.g. dbserver 4, and only
> after you're sure it's up and available, then add it to the
> client CellServDB. Then remove dbserver #3 from the client
> CellServDB, and then turn off dbserver #3.
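The ordering suggested in the quoted text can be sketched as a small simulation. This is only an illustration under stated assumptions: the function name `rolling_replacement`, the host names, and the pairing of each new server with the highest-numbered remaining old one (4 replaces 3, as in the quote; the later pairs are a guess) are all hypothetical, and nothing here talks to a real cell or edits a real 'CellServDB'.

```python
def rolling_replacement(old, new):
    """Return (steps, client_list) for swapping DB servers one pair at a time.

    Hypothetical model: `live` is the set of running DB servers and
    `client` is the list a client 'CellServDB' would contain.
    """
    live = set(old)
    client = list(old)
    steps = []

    def check():
        # Point of the quoted advice: the client list never names a down server.
        assert set(client) <= live, "client list references a server that is down"

    # Assumed pairing: the last old server is retired first (4 replaces 3).
    for retiring, incoming in zip(reversed(old), new):
        live.add(incoming)
        steps.append(f"bring up {incoming} and confirm it is available")
        check()
        client.append(incoming)
        steps.append(f"add {incoming} to client CellServDB")
        check()
        client.remove(retiring)
        steps.append(f"remove {retiring} from client CellServDB")
        check()
        live.discard(retiring)
        steps.append(f"turn off {retiring}")
        check()
    return steps, client


if __name__ == "__main__":
    steps, client = rolling_replacement(["db1", "db2", "db3"],
                                        ["db4", "db5", "db6"])
    for number, step in enumerate(steps, start=1):
        print(number, step)
    print("final client list:", client)
```

The sketch covers only the client-side list; it says nothing about 'server/CellServDB', where each change may trigger a sync site election.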

For the client 'CellServDB' I simply did not expect any issues: my
expectation was that the clients would scan the list of addresses
very quickly (starting with the lowest-numbered, for example), find
a live member of the "quorum", and then if necessary get from it the
address of the sync site. That is close to what they seem to do,
only very slowly.

I would have liked to put all 6 (different) IP addresses (3 up, 3
down) in the client 'CellServDB' and in 'fs newcell' to minimize the
number of updates, but I could not, because a local configuration
management system puts the same list in the client and server
'CellServDB'. Doing it manually on a test client seemed to work
fine, except for the 'vos' clients and their very long search
timeouts.

My real issue was 'server/CellServDB', because we could not prepare
all 3 new servers ahead of time, only one at a time. With each
'server/CellServDB' update there is potentially a DB daemon (PT, VL)
restart (even though the rekeying instructions hint that the DB
daemons reread the file when its mtime changes) and in any case a
sync site election.

Because each election causes a "blip" for the clients, I would
rather change 'server/CellServDB' by putting in extra entries ahead
of time, or by leaving in entries for disabled servers, to reduce
the number of times elections are triggered. Otherwise I can only
update one server per week...

Ideally, if I want to reshape the cell from DB servers 1, 2, 3 to
4, 5, 6, I'd love to be able to do it by first putting all 6 in
'server/CellServDB', with 4, 5, 6 not yet available, and only at the
end removing 1, 2, 3. That does not play well with the "quorum" (if
one of the 3 live servers fails) :-) so I went halfway.

> You would need to keep the server-side CellServDB accurate on
> the dbservers in order for them to work, but the client
> CellServDB files can be missing dbservers. [ ... ]

It would be nice to know more about the details here, to make
planning future updates easier. For example, in an ideal world
putting more or fewer DB servers in the client 'CellServDB' should
not matter, as long as one that belongs to the cell is up; again, if
the logic for all types of client were: "scan quickly the list of
potential DB servers, find one that is up, belongs to the cell and
reckons it is part of the quorum, and if necessary get from it the
address of the sync site".

Similarly (within limits), deliberately having non-up DB servers in
'server/CellServDB' should not matter that much, because non-up DB
servers happen anyhow in case of failures.

_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info
