Monitoring
Real-time infrastructure health and metrics powered by Prometheus.
Overview
Umoo's monitoring stack integrates with a Prometheus server to visualize:
- Fleet-wide resource usage (CPU, memory, network)
- Per-device telemetry (agent uptime, ping RTT, disk, goroutines)
- Platform metrics (DB connections, ingress throughput)
Data is collected by the telemetry plugin running on each device and scraped by Prometheus.
Dashboards
Navigate to Monitor to see your monitoring dashboards.
New tenants get a default Fleet Overview dashboard with pre-built panels:
- Total Devices / Online Devices — stat panels
- Fleet Online Rate — gauge with color thresholds
- Avg CPU Usage / Avg Memory Usage / Avg Disk Usage — time series panels
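To illustrate what the Fleet Online Rate gauge conveys, the sketch below computes the share of online devices and maps it to a color. The threshold values (70 and 90) and the function names are assumptions chosen for illustration; they are not documented Umoo defaults.

```python
def online_rate(online: int, total: int) -> float:
    """Return the percentage of devices online (0.0 for an empty fleet)."""
    if total <= 0:
        return 0.0
    return 100.0 * online / total


def gauge_color(rate: float, warn: float = 70.0, ok: float = 90.0) -> str:
    """Map a percentage to a gauge color using hypothetical thresholds."""
    if rate >= ok:
        return "green"
    if rate >= warn:
        return "yellow"
    return "red"
```

For example, a fleet with 45 of 50 devices online has a rate of 90%, which this sketch would show as green.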
Custom Dashboards
You can create, edit, and delete dashboards directly in the UI:
- Click + Create Dashboard.
- Add rows and panels with PromQL queries.
- Available panel types: stat, gauge, timeseries, bar, table.
- Set time range and refresh interval.
A Template Gallery provides pre-built dashboard templates you can apply with one click.
Device Monitor Tab
Individual device telemetry is available in the device detail Monitor tab.
Panels are organized into rows:
Health Row
| Panel | Metric |
|---|---|
| Agent Uptime | How long the umoo-agent has been running |
| Ping RTT | Round-trip time from backend to device |
Resources Row
| Panel | Metric |
|---|---|
| CPU Usage | Percentage across all cores |
| Memory | Heap memory usage (MB) |
| Disk Usage | Root filesystem usage percentage |
Network Row
| Panel | Metric |
|---|---|
| Network RX | Bytes received per second |
| Network TX | Bytes transmitted per second |
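Per-second rates like these are commonly derived from cumulative byte counters sampled at two points in time. The sketch below shows that arithmetic, assuming the underlying metric is a monotonically increasing counter (the source does not state the metric type). The function name and the reset handling are illustrative.

```python
def bytes_per_second(prev_bytes: int, prev_ts: float,
                     curr_bytes: int, curr_ts: float) -> float:
    """Compute an average byte rate between two counter samples."""
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr_bytes - prev_bytes
    if delta < 0:
        # Assumption: a decrease means the counter was reset (e.g. after a
        # reboot), so count only the bytes seen since the reset.
        delta = curr_bytes
    return delta / elapsed
```

With samples taken 30 seconds apart, a counter that moves from 1000 to 4000 bytes corresponds to 100 bytes per second.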
System Row
| Panel | Metric |
|---|---|
| Load Avg (1m) | Linux load average |
| Processes | Number of running processes |
| Agent Heap | Go heap allocation (MB) |
| Agent Goroutines | Number of active goroutines |
Telemetry Plugin Configuration
The telemetry plugin controls what metrics are collected and how often.
Configure it under Settings → Plugin Config (tenant-wide default) or per device group under Devices → Groups → [Group] → Plugins.
| Setting | Description |
|---|---|
| Collection Interval | Seconds between metric collection cycles (default: 30) |
| Enabled Metrics | Select which categories to collect |
Metric Categories
| Category | Metrics |
|---|---|
| cpu | CPU usage percentage across all cores |
| memory | Heap memory usage in megabytes |
| disk | Disk usage percentage of root filesystem |
| network | Network interface bytes received/transmitted |
| runtime | Agent uptime, goroutines, memory allocation, Go version |
Disabling a category stops the agent from collecting that data, so it is no longer available for Prometheus to scrape. This also reduces agent resource usage.
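The two settings and the tenant/group scoping can be modeled roughly as follows. This is a sketch only: the field names, the validation rules, and the assumption that a device group's configuration takes precedence over the tenant-wide default are all illustrative and not taken from the Umoo configuration schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Category names as listed in the Metric Categories table.
KNOWN_CATEGORIES = ("cpu", "memory", "disk", "network", "runtime")


@dataclass
class TelemetryConfig:
    # Hypothetical field names; the default of 30 seconds matches the table.
    collection_interval: int = 30
    enabled_metrics: Tuple[str, ...] = KNOWN_CATEGORIES

    def validate(self) -> None:
        """Reject non-positive intervals and unknown category names."""
        if self.collection_interval <= 0:
            raise ValueError("collection_interval must be positive")
        unknown = set(self.enabled_metrics) - set(KNOWN_CATEGORIES)
        if unknown:
            raise ValueError(f"unknown categories: {sorted(unknown)}")


def effective_config(tenant: TelemetryConfig,
                     group: Optional[TelemetryConfig]) -> TelemetryConfig:
    """Assumed precedence: a group-level config overrides the tenant default."""
    return group if group is not None else tenant
```

Under these assumptions, a group that only needs CPU and memory data at a 60-second interval would resolve to `TelemetryConfig(60, ("cpu", "memory"))`, while groups without their own configuration would fall back to the tenant default.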
Prometheus Integration
Umoo proxies Prometheus queries through /api/v1/metrics/. The monitoring UI uses this proxy — no direct Prometheus access is needed.
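A client other than the monitoring UI could address the proxy in much the same way. The sketch below only builds a request URL. It assumes the proxy mirrors the Prometheus HTTP API, so that an instant query lives at `query` under the proxy path. That mapping, the host `umoo.example.com`, and the helper name are assumptions; authentication is not covered here.

```python
from urllib.parse import urlencode

# Proxy path as documented above.
PROXY_PATH = "/api/v1/metrics/"


def build_query_url(base_url: str, promql: str, endpoint: str = "query") -> str:
    """Build a URL for a PromQL query sent through the Umoo proxy.

    Assumes (not confirmed by the source) that the proxy forwards
    <PROXY_PATH><endpoint> to the matching Prometheus HTTP API endpoint.
    """
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + PROXY_PATH + endpoint + "?" + query_string


# Hypothetical host; `up` is a metric Prometheus generates for every scrape target.
url = build_query_url("https://umoo.example.com", "up")
```

The resulting URL can then be fetched with any HTTP client that carries the caller's Umoo credentials.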
To configure the Prometheus URL, set it in the server configuration:
prometheus:
  url: http://prometheus:9090
Alerting
Alerts are managed separately via Alertmanager. See Alerts for details.