> I don't think *condition1* and *condition2* will work as labels and
> label values returned by condition1 and condition2 are different.
You can restrict the matching to the labels that both sides share:

condition1 and on (instance,mountpoint) condition2

This assumes that both expressions have "instance" and "mountpoint"
labels; these are the only ones considered when matching. "and" matches
many-to-many, so the many-to-1 relationship from the left-hand side (users)
to the right-hand side (filesystem) is fine, and the result keeps the LHS
labels, so "username" is carried forward. Note that "and" does not accept
group_left; you would only need group_left if you joined the two sides
with an arithmetic or comparison operator instead.
> So i need 3 rules - 1 each for server1,server2 and server3
I don't think so. If the expression does not pin a single instance, the
vector of results can include values for each (user,filesystem,instance) on
the LHS and each (filesystem,instance) on the RHS, so one rule alerts
separately for every filesystem that reaches 90%.
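
As a rough sketch of what such a single rule might look like: this assumes
that user_disk_usage_bytes (the hypothetical custom-exporter metric from your
question) carries "instance" and "mountpoint" labels with the same values as
the node_exporter series for that host. If the custom exporter listens on its
own port, the instance label will differ and would need relabelling first.
The alert name and the severity label are placeholders of mine.

```yaml
groups:
  - name: user-disk-usage
    rules:
      # One rule for all servers: no instance matcher is hard-coded.
      - alert: UserDiskUsageHigh
        expr: |
          # Top 10 users per server, as a percentage of the filesystem size.
          topk by (instance) (10,
            100 * user_disk_usage_bytes
              / on (instance, mountpoint) group_left ()
            node_filesystem_size_bytes
          )
          # Keep them only where the filesystem has less than 10% available.
          and on (instance, mountpoint)
          (
            node_filesystem_avail_bytes / node_filesystem_size_bytes < 0.1
          )
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: '{{ $labels.username }} is using {{ printf "%.1f" $value }}% of {{ $labels.mountpoint }} on {{ $labels.instance }}, please clean up.'
```

One caveat with topk in an alert expression: a user who drops out of the top
10 and later re-enters starts the "for" timer again.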
On Wednesday 28 February 2024 at 22:55:11 UTC+7 Puneet Singh wrote:
> Hi All,
> I have a monitoring requirement related to the user level disk usage and
> alerting. And i am wondering if prometheus is the correct tool to handle
> this requirement or,
> a custom python script (which uses os, subprocess, smtp module) to
> handle monitoring and alerting will be optimal solution in this context?
>
>
> Here is the problem description -
> In our setup we have 3 servers. Each has a single mount point "/", and each
> user's directory, such as "/home/user1", "/home/user2", and so forth,
> resides within this mount point.
> [image: Untitled11.png]
> We enforce disk quotas for individual users, and our goal is to monitor
> each user's disk usage and trigger alerts to the top 10 users when overall
> quota exceeds 90%.
>
>
> Challenges:
> 1. Afaik, prometheus monitors the overall storage status and the
> mountpoint information, so individual user's disk consumption is not being
> tracked by Prometheus. Example -
> [image: Untitled12.png]
>
> a) Do i need to write custom exporter here which uses du -sh to figure out
> the disk usage ? where
> user_disk_usage_bytes{username="ravi"} 390000
>
> b) or node exporter can do this?
>
>
>
>
> after data collection, i need to deal with alerting rule
> 2. Here is the alert condition on the custom exporter-
>
> *condition1:* can help determine the users who have high usage
> topk( user_disk_usage_bytes / scalar(
> node_filesystem_size_bytes{instance="server1:9100",mountpoint='/'} ) )
>
> *condition2:* this can help determine if the usage has reached 90%
> (available space less than 10%)
> ( node_filesystem_avail_bytes{instance="server1:9100",mountpoint='/'}
> / node_filesystem_size_bytes{ instance="server1:9100",mountpoint='/' }
> ) < 0.1
>
> I don't think *condition1* and *condition2* will work as labels and
> label values returned by condition1 and condition2 are different.
>
> Is there a way to achieve this with PromQL ?
>
> Now, assuming that i am able to get a list of users if system utilization
> is 90% as -
> {username="ravi"} 80
> {username="user1"} 90
> {username="user2"} 70
> {username="user3"} 80
> {username="user4"} 90
>
> the alerting rule will be
> groups:
> - name: example
>   rules:
>   - alert: Storage space is low on server1
>     expr: *condition1* and *condition2*
>     for: 10m
>     labels:
>       alertname: "Server1's Storage space is running low, Please cleanup the disk space - {{ $labels.username }}"
>     annotations:
>       summary: "you are using {{ $value }}% space on the / space.please cleanup."
> So i need 3 rules - 1 each for server1,server2 and server3
>
> 3. Now alert manager is responsible to sending out the alerts
> And to send the alert , i think this should be the configuration in
> current context -
> [image: Untitled14.png]
> as i have already included username in the alert name , and by default
> grouping of alert happens by alertname so i think with this setting 1:1
> email should be sent to each user.
>
>
>
> Apologies for the lengthy post , but I have tried expressing the flow to
> solve this problem based on my understanding of Prometheus so far.
>
> I would greatly appreciate any insights, recommendations, or best
> practices you can offer for achieving dynamic user disk usage
> monitoring with Prometheus and Alert Manager.
>
> Thank you in advance .
>
> Best regards,
> Puneet
>