Automating System Restarts on CPU Overload Using systemd

When managing a server or workstation, unexpected CPU spikes can slow the system down, degrade performance, or even lead to crashes. One blunt but effective safeguard is to restart the system automatically when CPU usage exceeds a threshold. Instead of relying on a long-running while loop, we’ll use a systemd service and timer to run the check on a schedule.

Why Use `systemd` Instead of Polling?

Most traditional monitoring scripts rely on a while loop that constantly checks CPU usage, consuming resources unnecessarily. With systemd, we:

Avoid keeping a looping process alive between checks.
Run each check as a short-lived job that exits when it is done.
Control the check interval with a timer unit.

Step 1: Creating the CPU Monitoring Script

We will write a Bash script to check CPU usage and trigger a restart if it surpasses a set threshold.

1. Write the Script

Open a terminal and create the script file:

sudo nano /usr/local/bin/cpu_monitor.sh

Add the following content:

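#!/bin/bash

THRESHOLD=90  # Reboot when CPU usage reaches this percentage

# /proc/stat counts since boot, so compare two samples taken one second apart.
read -r _ user1 nice1 system1 idle1 _ < /proc/stat
sleep 1
read -r _ user2 nice2 system2 idle2 _ < /proc/stat

busy=$(( (user2 + nice2 + system2) - (user1 + nice1 + system1) ))
total=$(( busy + idle2 - idle1 ))
CPU_USAGE=$(( total > 0 ? busy * 100 / total : 0 ))

if [ "$CPU_USAGE" -ge "$THRESHOLD" ]; then
    echo "High CPU detected ($CPU_USAGE%), rebooting system..."
    systemctl reboot  # the service already runs as root, so sudo is not needed
fi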

2. Make the Script Executable

sudo chmod +x /usr/local/bin/cpu_monitor.sh

This script:
✔ Extracts CPU usage directly from /proc/stat without external dependencies.
✔ Uses integer-based comparison to eliminate the need for bc.
✔ Triggers a reboot if CPU usage reaches or exceeds the predefined threshold.
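
Because the counters in /proc/stat accumulate from boot, a figure for current usage has to come from the difference between two readings. A minimal sketch of that calculation as a reusable function follows. The name cpu_busy_percent is hypothetical, and treating iowait as idle time is an assumption you may want to change.

```shell
#!/bin/bash
# Hypothetical helper: percentage of non-idle CPU time between two "cpu" lines
# read from /proc/stat. Fields after the label are:
# user nice system idle iowait irq softirq steal guest guest_nice
cpu_busy_percent() {
    local first="$1" second="$2"
    printf '%s\n%s\n' "$first" "$second" | awk '
        {
            total = 0
            # Sum user..steal; guest time is already included in user and nice.
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            idle = $5 + $6   # assumption: count iowait as idle
            if (NR == 1) { total1 = total; idle1 = idle }
            else         { total2 = total; idle2 = idle }
        }
        END {
            elapsed = total2 - total1
            if (elapsed <= 0) { print 0; exit }
            print int((elapsed - (idle2 - idle1)) * 100 / elapsed)
        }'
}

# Two made-up samples: 100 ticks elapsed, 50 of them idle.
cpu_busy_percent "cpu  100 0 50 850 0 0 0 0 0 0" "cpu  150 0 50 900 0 0 0 0 0 0"   # prints 50
```

On a live system, the two arguments would be the output of grep '^cpu ' /proc/stat captured before and after a short sleep.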

Step 2: Setting Up a `systemd` Service

Next, we create a systemd service to execute this script.

1. Create a New Service File

sudo nano /etc/systemd/system/cpu-monitor.service

Add the following content:

[Unit]
Description=Monitor CPU usage and restart if overloaded
After=network.target

[Service]
ExecStart=/usr/local/bin/cpu_monitor.sh
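Type=oneshot

2. Reload systemd

sudo systemctl daemon-reload

At this point, we have a service that runs our script once each time it is started. It deliberately has no [Install] section, so it is never enabled on its own; the timer in Step 3 starts it. Starting it from multi-user.target as well would run the check during boot, when CPU load is often high.

Step 3: Scheduling the Script with a `systemd` Timer

Instead of manually running the service, we configure a timer for automated execution.

1. Create a Timer File

sudo nano /etc/systemd/system/cpu-monitor.timer

Add the following content:

[Unit]
Description=Run CPU monitor every 10 seconds

[Timer]
OnActiveSec=1min
OnUnitActiveSec=10s
AccuracySec=1s
Unit=cpu-monitor.service

[Install]
WantedBy=timers.target

This ensures:
✔ The first check runs one minute after the timer starts (OnActiveSec), which gives the system time to settle after boot. Without it, OnUnitActiveSec alone has no starting point until the service has run once.
✔ After that, the service runs every 10 seconds (OnUnitActiveSec). AccuracySec=1s is needed because the default accuracy of one minute would let systemd delay each run by up to a minute.
✔ No process stays resident between checks. Persistent=true is omitted because it only affects OnCalendar timers.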
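
To experiment with different check intervals, the timer unit can be generated from a small shell function. This is only a sketch: the function name render_timer_unit is hypothetical, and the one-minute initial delay and one-second accuracy are assumptions rather than required values.

```shell
#!/bin/bash
# Hypothetical helper: print a timer unit for cpu-monitor.service with the
# given check interval (a systemd time span such as 10s, 30s, or 2min).
render_timer_unit() {
    local interval="${1:-10s}"
    cat <<EOF
[Unit]
Description=Run CPU monitor every ${interval}

[Timer]
# Assumption: wait one minute after the timer starts before the first check.
OnActiveSec=1min
OnUnitActiveSec=${interval}
# Assumption: tighten the default one-minute accuracy window for short intervals.
AccuracySec=1s
Unit=cpu-monitor.service

[Install]
WantedBy=timers.target
EOF
}

# Preview the unit for a 30-second interval.
render_timer_unit 30s
```

To install the result, pipe the output to sudo tee /etc/systemd/system/cpu-monitor.timer, then run sudo systemctl daemon-reload and restart the timer.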

2. Enable and Start the Timer

sudo systemctl enable cpu-monitor.timer
sudo systemctl start cpu-monitor.timer

3. Verify the Timer

To ensure the timer is functioning:

systemctl list-timers --all | grep cpu-monitor

To manually trigger the service:

sudo systemctl start cpu-monitor.service

To check logs for execution status:

journalctl -u cpu-monitor.service --no-pager

Step 4: Fine-Tuning for Stability

Before deploying this setup on a production server, consider:
✔ Logging CPU spikes before rebooting for debugging.
✔ Triggering alerts instead of an immediate reboot (via email or notifications).
✔ Including RAM and Disk usage checks for more robust monitoring.
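
The checks suggested above can be sketched as a few shell functions that the monitoring script could call before deciding what to do. The function names and the cpu-monitor log tag are illustrative assumptions. The process report only survives a reboot if the journal is configured for persistent storage.

```shell
#!/bin/bash
# Sketch only: function names and the "cpu-monitor" tag are assumptions.

# Percentage of RAM in use, based on MemTotal and MemAvailable.
# Takes an optional file path so it can be tried against a saved copy.
mem_used_percent() {
    awk '/^MemTotal:/     {total = $2}
         /^MemAvailable:/ {avail = $2}
         END { if (total > 0) print int((total - avail) * 100 / total); else print 0 }' \
        "${1:-/proc/meminfo}"
}

# Percentage of disk space used on the filesystem holding the given path.
disk_used_percent() {
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Record the busiest processes before any reboot is triggered.
# Falls back to stderr when logger is not installed.
log_overload() {
    local report
    report=$(ps -eo pid,comm,pcpu --sort=-pcpu 2>/dev/null | head -n 6)
    if command -v logger >/dev/null 2>&1; then
        printf '%s\n' "$report" | logger -t cpu-monitor
    else
        printf '%s\n' "$report" >&2
    fi
}
```

In the monitoring script, log_overload would be called just before the reboot command, and the two percentage functions can be compared against their own thresholds in the same way as CPU_USAGE.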

Using systemd, we have built a lightweight, timer-driven solution to restart a system under high CPU load. Unlike a polling loop, no process stays resident between checks; each run starts, takes its measurement, and exits.

Author: tayyebi
