ctubbsii commented on issue #4973: URL: https://github.com/apache/accumulo/issues/4973#issuecomment-2418500452
I would suggest getting rid of the log aggregation on the monitor as well. It is quite a pain to do it properly and without killing the tservers, the network, or the monitor. Logs don't survive a restart, and are dropped if there are too many. And, any solution we come up with is going to be much more complex and less useful than a simple rsyslog setup, or a small scheduled rsync script. For large installations, it's worth setting up something suitable for log aggregation and analysis. For small installations, you can just ssh to a tserver and cat/grep/less the logs (what we do in development).

Recently, we found a problem with too many TCP connections with our attempted fix for ensuring logging was async to the monitor in #4879. The proposed solutions aren't great.

Things I'd want to keep are things that give you a big picture view of a deployed cluster:

1. List of namespaces, list of tables in each namespace
2. A page or view for each table (list of tablets, whether or not they are hosted)
3. List of servers, organized in resource groups and by server type
4. Overall health status for servers (some kind of obvious visual signal to indicate "healthy", "needs attention", "out of service", or similar)
5. Basic activity report for each server, depending on the server type (gc should report on when the last garbage collection ran, when the next is expected to run, for example; the manager should report on its core responsibilities, like running fate operations and balancing; compactors should report on whether they are compacting; tservers and sservers should report what tablets they are hosting/scanning and client scan sessions, etc.)

These shouldn't duplicate merely reporting detailed metrics that can be obtained directly, but they could utilize some of the metrics to provide a more meaningful view of the status of the system or a particular component's health and status.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
