Monitoring

Real-time infrastructure health and metrics powered by Prometheus.

Overview

Umoo's monitoring stack integrates with a Prometheus server to visualize:

Fleet-wide resource usage (CPU, memory, network)
Per-device telemetry (agent uptime, ping RTT, disk, goroutines)
Platform metrics (DB connections, ingress throughput)

Data is collected by the telemetry plugin running on each device and scraped by Prometheus.

Dashboards

Navigate to Monitor to view your dashboards.

New tenants get a default Fleet Overview dashboard with pre-built panels:

Total Devices / Online Devices — stat panels
Fleet Online Rate — gauge with color thresholds
Avg CPU Usage / Avg Memory Usage / Avg Disk Usage — time series panels

Custom Dashboards

You can create, edit, and delete dashboards directly in the UI:

Click + Create Dashboard.
Add rows and panels with PromQL queries.
Available panel types: stat, gauge, timeseries, bar, table.
Set time range and refresh interval.
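To make the panel options above concrete, here is a minimal sketch of checking a panel definition before entering it in the UI. The dictionary shape, the `make_panel` helper, and the metric name `device_cpu_usage_percent` are all hypothetical. Only the list of panel types comes from this page.

```python
# Panel types listed on this page.
PANEL_TYPES = ("stat", "gauge", "timeseries", "bar", "table")


def make_panel(title: str, panel_type: str, promql: str) -> dict:
    """Build a panel description, rejecting unsupported panel types.

    The returned dictionary layout is an assumption for illustration,
    not Umoo's stored dashboard format.
    """
    if panel_type not in PANEL_TYPES:
        raise ValueError(f"unsupported panel type: {panel_type!r}")
    if not promql.strip():
        raise ValueError("a panel needs a PromQL query")
    return {"title": title, "type": panel_type, "query": promql}


if __name__ == "__main__":
    # `device_cpu_usage_percent` is a made-up metric name for illustration.
    print(make_panel("Avg CPU Usage", "timeseries", "avg(device_cpu_usage_percent)"))
```

A check like this catches a mistyped panel type early. The actual metric names to query depend on what the telemetry plugin exports in your deployment.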

A Template Gallery provides pre-built dashboard templates you can apply with one click.

Device Monitor Tab

Individual device telemetry is available in the device detail Monitor tab.

Panels are organized into rows:

Health Row

Panel	Metric
Agent Uptime	How long the umoo-agent has been running
Ping RTT	Round-trip time from backend to device

Resources Row

Panel	Metric
CPU Usage	Percentage across all cores
Memory	Heap memory usage (MB)
Disk Usage	Root filesystem usage percentage

Network Row

Panel	Metric
Network RX	Bytes received per second
Network TX	Bytes transmitted per second

System Row

Panel	Metric
Load Avg (1m)	1-minute Linux load average
Processes	Number of running processes
Agent Heap	Go heap allocation (MB)
Agent Goroutines	Number of active goroutines

Telemetry Plugin Configuration

The telemetry plugin controls what metrics are collected and how often.

Configure it under Settings → Plugin Config (tenant-wide default) or per device group under Devices → Groups → [Group] → Plugins.

Setting	Description
Collection Interval	Seconds between metric collection cycles (default: 30)
Enabled Metrics	Select which categories to collect
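The relationship between the tenant-wide default and a per-group configuration can be pictured with a small sketch. It assumes that a value set on a device group takes precedence over the tenant-wide value, which this page implies but does not state outright. The key names `collection_interval` and `enabled_metrics` and the `effective_settings` helper are hypothetical.

```python
from typing import Optional

# Documented default collection interval, in seconds.
DEFAULT_INTERVAL = 30


def effective_settings(tenant: dict, group: Optional[dict] = None) -> dict:
    """Merge settings; keys present in the group config win (assumed precedence)."""
    merged = {"collection_interval": DEFAULT_INTERVAL}
    merged.update(tenant)  # tenant-wide values replace the built-in default
    if group:
        merged.update(group)  # group values replace tenant-wide values
    return merged


if __name__ == "__main__":
    tenant = {"collection_interval": 60}
    group = {"enabled_metrics": ["cpu", "disk"]}
    print(effective_settings(tenant, group))
```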

Metric Categories

Category	Metrics
`cpu`	CPU usage percentage across all cores
`memory`	Heap memory usage in megabytes
`disk`	Disk usage percentage of root filesystem
`network`	Network interface bytes received/transmitted
`runtime`	Agent uptime, goroutines, memory allocation, Go version

Disabling a category stops the agent from collecting those metrics, so they are no longer available to Prometheus, and reduces agent resource usage.
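As an illustration of how the two settings fit together, the sketch below models them as a small validated structure. `TelemetryConfig` and its field names are hypothetical, and enabling every category by default is an assumption. The category names and the 30-second default come from the tables above.

```python
from dataclasses import dataclass, field

# Category names listed in the table above.
KNOWN_CATEGORIES = {"cpu", "memory", "disk", "network", "runtime"}


@dataclass
class TelemetryConfig:
    """Hypothetical in-memory form of the plugin settings."""

    collection_interval: int = 30  # seconds, the documented default
    # Assumption: all categories are enabled unless configured otherwise.
    enabled_metrics: set = field(default_factory=lambda: set(KNOWN_CATEGORIES))

    def validate(self) -> None:
        """Reject unknown category names and non-positive intervals."""
        unknown = self.enabled_metrics - KNOWN_CATEGORIES
        if unknown:
            raise ValueError(f"unknown categories: {sorted(unknown)}")
        if self.collection_interval <= 0:
            raise ValueError("collection interval must be positive")


def categories_to_collect(config: TelemetryConfig) -> list:
    """Return the enabled categories in a stable, sorted order."""
    config.validate()
    return sorted(config.enabled_metrics)


if __name__ == "__main__":
    print(categories_to_collect(TelemetryConfig(enabled_metrics={"cpu", "disk"})))
```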

Prometheus Integration

Umoo proxies Prometheus queries through /api/v1/metrics/. The monitoring UI uses this proxy, so no direct Prometheus access is needed.
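If you want to script against the proxy, a request URL might be built as in the sketch below. It assumes the proxy mirrors the Prometheus HTTP API layout (a `query` endpoint taking a `query` parameter) beneath `/api/v1/metrics/`, which this page does not confirm. The host `https://umoo.example.com` is a placeholder, and authentication is omitted.

```python
from urllib.parse import urlencode, urljoin

# Proxy prefix documented on this page.
PROXY_PREFIX = "/api/v1/metrics/"


def build_query_url(base_url: str, promql: str) -> str:
    """Return an instant-query URL routed through the Umoo proxy.

    Assumption: the proxy forwards `query?query=...` to Prometheus unchanged.
    """
    endpoint = urljoin(base_url.rstrip("/") + "/", PROXY_PREFIX.lstrip("/") + "query")
    # urlencode percent-escapes characters such as brackets in PromQL.
    return endpoint + "?" + urlencode({"query": promql})


if __name__ == "__main__":
    # Hypothetical host; `up` is the metric Prometheus records per scrape target.
    print(build_query_url("https://umoo.example.com", "up"))
```

Check the path against your deployment before relying on it, since the sub-paths the proxy exposes are not listed here.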

To configure the Prometheus URL, set it in the server configuration:

```yaml
prometheus:
  url: http://prometheus:9090
```

Alerting

Alerts are managed separately via Alertmanager. See Alerts for details.
