How to Monitor Cron Jobs

The most common way cron jobs fail is the one nobody plans for: they produce zero output. No error message, no log entry, no stack trace. The job simply doesn't run. Maybe the disk filled up and crontab couldn't write its temp file. Maybe someone ran crontab -r instead of crontab -e and wiped the entire schedule. Maybe the PATH changed after a system update and the binary can't be found anymore.

Here's a scenario that plays out every week somewhere: a nightly pg_dump backup job runs at 02:00 UTC. One Tuesday, the Postgres server moves to a new port after an upgrade. The cron job fails with “connection refused” — but since stdout and stderr aren't redirected, nobody sees it. Three weeks later, a developer drops a table by accident, reaches for the backup, and finds the most recent one is 21 days old. That's the cost of unmonitored cron jobs.

The 5 methods to monitor cron jobs

1. Redirect output to a log file

The simplest approach: redirect everything to a file and hope someone reads it.

# crontab -e
0 2 * * * /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1

Pros: Zero setup beyond the redirect. Works everywhere.

Cons: Nobody reads log files proactively. If the job doesn't run at all (crontab cleared, crond stopped), the log file just… stops growing. Silently. You get no alert.
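One way to narrow that gap is to check the log's modification time from somewhere else. The following is a minimal sketch, not a standard tool: the helper name is_log_stale is hypothetical, and the threshold you pass in is your own choice.

```python
import os
import time


def is_log_stale(path, max_age_seconds, now=None):
    """Return True if the file is missing or was last modified too long ago."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        # A missing or unreadable log counts as stale: the job may never have run.
        return True
    return (now - mtime) > max_age_seconds
```

For the nightly backup above, something like is_log_stale("/var/log/backup.log", 26 * 3600) would flag a log that has not been touched in just over a day. The check has to run outside the crontab it is watching, or it shares the same failure mode.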

2. Send email on failure with MAILTO

Cron has a built-in email feature. Add MAILTO to the top of your crontab, and cron emails you whenever a job writes anything to stdout or stderr, whether or not it failed.

MAILTO=ops@example.com
0 2 * * * /usr/local/bin/backup.sh

Pros: Built into cron. No extra tools.

Cons: Requires a working MTA (sendmail/postfix) on the server. Mail sent straight from a server's local MTA often lands in spam. Biggest problem: if the job doesn't run, there's no output, so no email. MAILTO only fires when there IS output; it cannot detect silence.

3. Wrapper scripts with error handling

Write a shell wrapper that checks the exit code and sends an alert to Slack (or wherever) on failure.

#!/bin/bash
# wrapper.sh: run a job, alert Slack on failure

/usr/local/bin/backup.sh
EXIT_CODE=$?

if [ "$EXIT_CODE" -ne 0 ]; then
  curl -s -X POST "https://hooks.slack.com/services/T00/B00/xxxxx" \
    -H "Content-Type: application/json" \
    -d "{\"text\":\"backup.sh failed with exit code $EXIT_CODE\"}"
fi

# Pass the job's exit code through so the caller still sees the real status
exit "$EXIT_CODE"

Pros: Flexible. You control the alert format and destination.

Cons: You write a wrapper per job (or a generic one you have to maintain). Still can't detect the case where the wrapper itself doesn't run — if cron stops, the wrapper never executes.
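A generic wrapper of the kind mentioned above could look roughly like this in Python. It is a sketch under assumptions: the function names post_to_slack and run_and_report are made up for illustration, and the webhook URL is the same placeholder used in the shell example.

```python
import json
import subprocess
import sys
import urllib.request

# Placeholder webhook URL copied from the shell example above.
SLACK_WEBHOOK = "https://hooks.slack.com/services/T00/B00/xxxxx"


def post_to_slack(text, url=SLACK_WEBHOOK):
    """Send a plain-text message to a Slack incoming webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(request, timeout=10)


def run_and_report(command, notify=post_to_slack):
    """Run command (a list of arguments); call notify on a non-zero exit code."""
    result = subprocess.run(command)
    if result.returncode != 0:
        notify(f"{command[0]} failed with exit code {result.returncode}")
    return result.returncode


# Only act as a CLI when a command was actually passed, e.g.
#   python3 wrapper.py /usr/local/bin/backup.sh
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(run_and_report(sys.argv[1:]))
```

Because the command is passed as arguments, one wrapper can serve every job in the crontab. It inherits the limitation described above: if cron never starts it, nothing is reported.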

4. Systemd timers instead of cron

Replace crontab entries with systemd timer units. You get structured logging via journalctl and an OnFailure= directive that can trigger another unit on error.

# /etc/systemd/system/backup.timer
[Unit]
Description=Nightly database backup

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target

# /etc/systemd/system/backup.service
[Unit]
Description=Database backup job
OnFailure=notify-failure@%n.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/backup.sh

Pros: Better logging than cron. Persistent=true runs missed jobs after a reboot. OnFailure= triggers on non-zero exits.

Cons: More complex setup. Two files per job instead of one crontab line. OnFailure= still only catches failures that produce a non-zero exit code — if the process hangs or the timer unit is disabled, you get nothing.

5. Heartbeat monitoring (the recommended approach)

Heartbeat monitoring flips the model. Instead of watching for errors, you watch for absence of success. Your cron job pings a URL every time it completes successfully. If the monitoring service doesn't receive a ping within the expected window, it alerts you. This catches the failure modes the other methods miss: crashes, silent failures, hung processes, disabled crontabs, rebooted servers.

Here's what it looks like in four languages:

# Bash: add to the end of your cron job
0 2 * * * /usr/local/bin/backup.sh && curl -s https://api.getcronsafe.com/ping/abc123

# Python
import requests

def run_backup():
    # ... your backup logic ...
    pass

# If run_backup() raises, the script stops here and no ping is sent
run_backup()
requests.get("https://api.getcronsafe.com/ping/abc123", timeout=5)

// Node.js 18+ (ES module, for top-level await and built-in fetch)
async function runBackup() {
  // ... your backup logic ...
}

await runBackup();
await fetch("https://api.getcronsafe.com/ping/abc123");

<?php
// PHP
function run_backup() {
    // ... your backup logic ...
}

run_backup();
file_get_contents("https://api.getcronsafe.com/ping/abc123");

The && in the bash example is important: the ping only fires if the job exits with code 0. A failed job won't ping, which triggers the “missing ping” alert.
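The same "ping only on success" rule can be made explicit in a script. Below is a minimal sketch using only the Python standard library; the helper names send_ping and run_with_heartbeat are hypothetical, and the URL is the placeholder from the examples above.

```python
import urllib.request

# Ping URL from the examples above; abc123 is a placeholder monitor ID.
PING_URL = "https://api.getcronsafe.com/ping/abc123"


def send_ping(url=PING_URL, timeout=5):
    """Send the heartbeat; return True on success, False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # urllib.error.URLError and socket timeouts are both OSError subclasses.
        return False


def run_with_heartbeat(job, ping=send_ping):
    """Run job(); send the ping only if job() returns without raising."""
    job()  # an exception here propagates, so the ping below is never sent
    return ping()
```

The ping helper swallows network errors on purpose, so that an outage at the monitoring service cannot make an otherwise successful backup look like a failed one. The missed ping still produces an alert on the monitoring side.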

Comparison table

Method                 Catches silent failures   Setup time   Maintenance
Log files              No                        1 min        Read logs daily
Email (MAILTO)         No                        1 min        None
Wrapper script         Partially                 30 min       Per job
Systemd timers         Partially                 15 min       Per job
Heartbeat monitoring   Yes                       30 sec       None

Setting up heartbeat monitoring in 60 seconds

The steps are the same regardless of which service you pick (CronSafe, Healthchecks.io, Cronitor, or others):

  1. Create an account on the monitoring service. Most have a free tier.
  2. Create a monitor and set the expected interval. If your job runs every hour, set the interval to 60 minutes and the grace period to 15 minutes.
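  3. Copy the ping URL and add it to the end of your cron job, chained with && curl -s so the ping is sent only when the job succeeds.
  4. Test it by running the full command manually, including the ping: bash /path/to/your/script.sh && curl -s followed by your ping URL. The monitor status should change from “pending” to “up”.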

CronSafe is free for 5 monitors with no expiration.

FAQ

What happens when a cron job fails silently?

Nothing visible. The job produces no output, no error log, and no notification. You only discover it when the consequences appear — missing backups, stale data, angry clients. Silent failures are the most dangerous kind because the feedback loop can be days or weeks long.

Can I monitor cron jobs without installing anything?

Yes. Heartbeat monitoring only requires adding a curl command to the end of your cron job. No agent to install, no SDK to import, no config file to maintain. As long as curl (or any other HTTP client) is already on the server you're monitoring, it doesn't need any new software.

How often should I check if my cron jobs are running?

Set the monitoring window to match your job's schedule plus a buffer. If your job runs every hour, set the check window to 75 minutes — that allows 15 minutes for slow execution before triggering an alert. For daily jobs, a 25-hour window works well.
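The window arithmetic is simple enough to model directly. The sketch below only illustrates the interval-plus-grace rule described here; it is not how any particular service implements it, and the function name is_overdue is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def is_overdue(last_ping, interval, grace, now):
    """Return True once more than interval + grace has passed since last_ping."""
    return now - last_ping > interval + grace


# Example values: an hourly job with a 15-minute grace period (a 75-minute window).
last_ping = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
hourly = timedelta(minutes=60)
grace = timedelta(minutes=15)

# 74 minutes after the last ping: still inside the window.
inside_window = is_overdue(last_ping, hourly, grace, last_ping + timedelta(minutes=74))
# 76 minutes after the last ping: the window has closed, so an alert would fire.
past_window = is_overdue(last_ping, hourly, grace, last_ping + timedelta(minutes=76))
```

A daily job with a one-hour buffer works the same way, with interval set to 24 hours and grace to 1 hour.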