
PVE host disk space monitoring - from pitfalls to UptimeKuma

Published: 2025-05-14
Proxmox VE lvm UptimeKuma Shell bash crontab ops-monitoring

Target Reader: This article is written for readers who have just started using Proxmox VE (PVE) and want to copy the steps to get monitoring running. It aims to:

  • explain why the problem happens
  • provide the full command at each step
  • give troubleshooting ideas when something goes wrong

0 Background - What went wrong?

When an LVM-Thin pool reaches 100%, guest writes start failing with I/O errors, and by the time you notice it is usually too late to react gracefully.

Conclusion: be sure to monitor data_percent (the pool's usage percentage) in real time.

1 General idea

| Step | Function | Technology |
|------|----------|------------|
| A | Write a check script lvm_check.sh: return JSON and decide OK / FAIL | Shell + jq + bc |
| B | Write a push script lvm_kuma_push.sh: send the results to UptimeKuma | Shell + curl |
| C | Create a new Push monitor in UptimeKuma | Web GUI |
| D | Use cron to run the push script regularly | Cron |

2 Prepare the environment

  1. Log in to the host (not the virtual machine) and make sure you have root permissions.

  2. Install the required tools:

sudo apt update
sudo apt install -y jq bc curl

  3. Confirm that the host is indeed using LVM-Thin:

lvs -o lv_name,vg_name,pool_lv,data_percent --noheadings

If you can see pve/data, and thin volumes whose pool_lv field is data, this article applies to your setup.
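To watch just the pool's own fill level (the data_percent that section 0 warns about), a one-liner like this works; pve/data is the default pool name on a stock PVE install, so adjust it if yours differs:

```shell
# Print only the pool's usage percentage, e.g. "42.17"
lvs --noheadings -o data_percent pve/data | tr -d ' '
```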

3 Step A: Write the check script /usr/local/bin/lvm_check.sh

sudo nano /usr/local/bin/lvm_check.sh
#!/bin/bash
# Determine whether the LVM-Thin pool usage exceeds the threshold (default 90%).
THRESHOLD=90
output='{"status":"ok","volumes":[]}'

while read -r lv vg usage lsize pool; do
  usage=$(echo "$usage" | tr -d ' %')
  # Keep thin volumes inside pool "data" AND the pool itself; the pool's
  # own row has an empty pool_lv (the field collapses away under read),
  # so match it by name instead.
  if [ "$pool" != "data" ] && [ "$lv" != "data" ]; then continue; fi
  [ -z "$usage" ] && continue

  if [[ "$usage" =~ ^[0-9]+(\.[0-9]+)?$ ]]; then
    if [ "$(echo "$usage > $THRESHOLD" | bc -l)" -eq 1 ]; then
      output=$(echo "$output" | jq '.status="fail"')
      output=$(echo "$output" | jq --arg lv "$lv" --arg vg "$vg" \
                                   --arg usage "$usage" --arg lsize "$lsize" \
          '.volumes += [{"lv":$lv,"vg":$vg,"usage":$usage,"lsize":$lsize}]')
    fi
  fi
done < <(lvs --noheadings -o lv_name,vg_name,data_percent,lv_size,pool_lv \
             --units g --nosuffix)

echo "$output"
sudo chmod +x /usr/local/bin/lvm_check.sh

Manual testing

/usr/local/bin/lvm_check.sh | jq

4 Step C: Create Push monitoring in UptimeKuma

  1. Open the UptimeKuma web interface -> New Monitor
  2. Select Type = Push
  3. Give it a name (example: PVE-LVM-Disk)
  4. Set Heartbeat Interval = 300 seconds (can be changed)
  5. After saving, record the Push URL generated by the system, in the form:
https://uptime.example.com/api/push/abcdef1234567890

5 Step B: Push script that sends the results to Kuma

5.1 Write the Push URL into the configuration

echo 'PUSH_URL="https://uptime.example.com/api/push/abcdef1234567890"' \
  | sudo tee /etc/lvm_check.conf

5.2 New script /usr/local/bin/lvm_kuma_push.sh

sudo nano /usr/local/bin/lvm_kuma_push.sh
#!/bin/bash
# Send the inspection result to UptimeKuma.
source /etc/lvm_check.conf
[ -z "$PUSH_URL" ] && { echo "missing PUSH_URL"; exit 1; }

result="$(/usr/local/bin/lvm_check.sh)"
status=$(echo "$result" | jq -r '.status')

if [ "$status" = "ok" ]; then
  # Kuma's push endpoint reads query parameters; -G turns the
  # url-encoded data into a GET query string.
  curl -fsS --retry 3 -G "${PUSH_URL}" \
       --data-urlencode "status=up" --data-urlencode "msg=OK"
else
  # msg may contain JSON brackets and quotes, so it must be URL-encoded
  # or the request URL becomes invalid.
  problem=$(echo "$result" | jq -c '.volumes')
  curl -fsS --retry 3 -G "${PUSH_URL}" \
       --data-urlencode "status=down" --data-urlencode "msg=${problem}"
fi
sudo chmod +x /usr/local/bin/lvm_kuma_push.sh

Verify once

/usr/local/bin/lvm_kuma_push.sh
# No error means the push succeeded, and Kuma should show a heartbeat

6 Step D: Use cron to execute regularly

sudo crontab -e

At the end of the file add:

*/5 * * * * /usr/local/bin/lvm_kuma_push.sh >/dev/null 2>&1

Meaning: run the push every 5 minutes, matching the 300-second heartbeat interval set in Kuma.

After saving and exiting, run sudo systemctl restart cron (Debian 12) to make sure the cron service is running.

7 Verify the result

  1. Wait 5-10 minutes and check whether UptimeKuma's PVE-LVM-Disk monitor has turned green (Up).
  2. Manually create high usage (or temporarily change THRESHOLD=90 to 1 in the script) and run the push again:

/usr/local/bin/lvm_kuma_push.sh
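To force a failure without actually filling the pool, one option is to lower the threshold in a throwaway copy of the check script (the /tmp path here is just an example):

```shell
# Copy the check script and drop the threshold so any usage trips it
cp /usr/local/bin/lvm_check.sh /tmp/lvm_check_test.sh
sed -i 's/^THRESHOLD=90/THRESHOLD=1/' /tmp/lvm_check_test.sh
bash /tmp/lvm_check_test.sh | jq -r '.status'
```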

8 Common errors and troubleshooting

| Error / Phenomenon | Cause & Solution |
|--------------------|------------------|
| jq: command not found | Run sudo apt install -y jq |
| curl: (28) Connection timed out | The network from the host to Kuma is unreachable; test curl -I $PUSH_URL first |
| Kuma page shows "No Heartbeat" | 1) cron is not taking effect; 2) PUSH_URL is wrong; 3) the script failed before pushing (check /var/log/syslog) |
| "status":"fail" but .volumes is empty | The host is not using LVM-Thin; use zfs list or other methods to monitor |
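For the last row: on a ZFS-based PVE host the equivalent number comes from zpool list; rpool is the PVE installer's default pool name, an assumption here:

```shell
# Read pool capacity (e.g. "78%") and strip the % sign for comparisons
cap=$(zpool list -H -o capacity rpool)
echo "${cap%\%}"
```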

9 Can be expanded later

| Idea | Direction |
|------|-----------|
| Graphical trends | Kuma's "Status Page" only shows Up/Down, but the JSON can be converted into Prometheus metrics for charts |
| Email/DingTalk notification | Configure it in Kuma's Notifications; the script does not need to change |
| systemd timer | If you do not like cron, write an lvm_kuma_push.timer for higher accuracy |
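As a sketch of the systemd-timer idea (the unit file names below are assumptions, mirroring the script name):

```ini
# /etc/systemd/system/lvm_kuma_push.service
[Unit]
Description=Push LVM-Thin usage to UptimeKuma

[Service]
Type=oneshot
ExecStart=/usr/local/bin/lvm_kuma_push.sh

# /etc/systemd/system/lvm_kuma_push.timer
[Unit]
Description=Run lvm_kuma_push every 5 minutes

[Timer]
OnCalendar=*:0/5
Persistent=true

[Install]
WantedBy=timers.target
```

Enable it with systemctl enable --now lvm_kuma_push.timer, and remove the cron entry so the push does not run twice.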

10 Summary

  1. Minimal dependencies: only jq, bc, and curl; all scripts live on the host, with zero changes on the VM side.
  2. Timely alerts: when the LVM-Thin pool exceeds the threshold, UptimeKuma turns red and notifies immediately.
  3. Clear steps: beginners can reproduce this by copy-pasting the article, and failure scenarios are covered too.

I hope this note helps you nip disk-full anxiety in the bud. If you have a cleaner approach or hit another pitfall, share it in the comments.