Repository locks (Restic)¶
Restic stores lock files in the repository (under locks/ on most backends)
so incompatible operations do not run at the same time. This page explains
when you see locks, why they sometimes remain, and what this image does
to keep you safe — especially on multi-host repositories.
For symptom-driven fixes, see Troubleshooting → Locking. For the audited unlock helper, see Unlock.
How Restic uses locks¶
- Non-exclusive locks: used for work that can overlap with other readers
or writers in a controlled way (typical restic backup traffic).
- Exclusive locks: required for operations that must not race with others
(for example restic forget, restic prune, and restic check in common setups).
Only one holder at a time.
Which command you run therefore determines whether you see contention,
immediate exit 11 ("failed to lock repository"), or long waits.
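As a minimal sketch of how a wrapper script might tell lock contention apart from other failures, the function below maps an exit status to a short label. `describe_restic_exit` is a hypothetical helper name, not part of Restic or this image, and it only distinguishes the exit code 11 discussed above from success and everything else.

```shell
#!/bin/sh
# Sketch: classify a Restic exit status for lock handling.
# describe_restic_exit is a hypothetical helper, not part of Restic or this image.
describe_restic_exit() {
  case "$1" in
    0)  echo "ok" ;;
    11) echo "lock-contention" ;;  # Restic: "failed to lock repository"
    *)  echo "other-failure" ;;    # anything else is not a lock problem per se
  esac
}

# Usage (commented out so the sketch has no side effects):
# restic forget --keep-daily 7
# describe_restic_exit "$?"
```

Keeping the classification separate from the Restic call makes it easy to decide later whether to wait, reschedule, or alert, without unlocking anything.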
Upstream background and interrupted runs:
- Restic may leave a lock behind if a process is killed (OOM,
SIGKILL, host reboot) or loses connectivity before it can clean up. Official troubleshooting describes interrupted commands and manual recovery with restic unlock when appropriate: Restic troubleshooting.
Stale lock vs legitimate lock¶
| Situation | What it means | What to do |
|---|---|---|
| Stale | The process that created the lock is gone; no other job should be using the repo. | Inspect with restic list locks, then clear with /bin/unlock (or raw restic unlock if you accept unmasked logs). |
| Legitimate | Another host or container still holds the lock, or a long job is in flight. | Do not unlock: wait, reschedule, or use --retry-lock / dedicated cron windows (below). |
| Exit 11 on forget | Another client holds the exclusive forget lock, or forget raced with another forget. | Not a reason to unlock blindly; see Backup worker → Multi-host and Forget worker. |
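To tell the rows above apart in practice, it helps to look at who holds each lock before deciding anything. The sketch below is one possible read-only inspection loop built from restic list locks and restic cat lock; `print_locks` is a hypothetical name, and the sketch assumes the repository and credentials are already configured in the environment. The JSON that Restic prints for a lock typically includes the holder's hostname, PID, and creation time, which is usually enough to check whether that process still exists.

```shell
#!/bin/sh
# Sketch: print every lock in the repository without modifying anything.
# print_locks is a hypothetical helper, not part of Restic or this image.
# Assumes RESTIC_REPOSITORY and credentials are already set in the environment.
print_locks() {
  restic list locks | while read -r lock_id; do
    # Skip blank lines in the listing.
    [ -n "$lock_id" ] || continue
    echo "== lock $lock_id"
    # </dev/null keeps the inner call from consuming the ID list on stdin.
    restic cat lock "$lock_id" </dev/null
  done
}

# Usage (commented out so the sketch has no side effects):
# print_locks
```

If the reported host is still running a job, the lock is legitimate and should be left alone.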
What this image does for you¶
Safer default: RESTIC_AUTO_UNLOCK=OFF¶
Since 1.12.0, /bin/backup and /bin/check do not automatically run
restic unlock after a failure. That avoids deleting another host's valid
lock on a shared repository. Opt in with RESTIC_AUTO_UNLOCK=ON only when
a single machine is the only client that ever uses the repository.
See Environment variables → RESTIC_AUTO_UNLOCK.
Never auto-unlock on forget exit 11¶
Inline restic forget and the standalone /bin/forget worker never invoke
restic unlock when Restic returns 11: the lock that blocked you is almost always
someone else's exclusive lock. Clearing it would allow concurrent mutations.
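The two rules above can be summarized as a small decision function. This is an illustrative sketch of the policy as described on this page, not the image's actual code: `may_auto_unlock` is a hypothetical name, and refusing for every worker other than backup and check is a conservative assumption made for the example.

```shell
#!/bin/sh
# Sketch: decide whether an automatic "restic unlock" would be permitted.
# may_auto_unlock WORKER EXIT_CODE  ->  returns 0 if permitted, 1 otherwise.
# Hypothetical helper; it mirrors the policy described above, not real code.
may_auto_unlock() {
  worker=$1
  exit_code=$2
  # A successful run never needs an unlock.
  [ "$exit_code" -ne 0 ] || return 1
  case "$worker" in
    backup|check)
      # Opt-in only: the default is OFF.
      [ "${RESTIC_AUTO_UNLOCK:-OFF}" = "ON" ]
      ;;
    *)
      # forget never auto-unlocks on exit 11. Refusing for all other
      # workers and exit codes is an assumption of this sketch.
      return 1
      ;;
  esac
}

# Usage (commented out so the sketch has no side effects):
# may_auto_unlock backup 1 && echo "auto-unlock permitted"
```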
Dedicated forget window: FORGET_CRON¶
When FORGET_CRON is set, /bin/backup skips inline post-backup forget and
the standalone /bin/forget worker owns the exclusive lock window. That removes
the classic multi-host race where two backups finish together and both try to
forget.
Local overlap: /bin/locked_run¶
Cron jobs are wrapped with flock so the same container does not start a
second backup (or check, …) while the previous tick is still running. That is
independent of Restic's repository locks but reduces self-inflicted overlap.
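For readers unfamiliar with flock, the wrapper idea can be sketched as follows. This is not the source of /bin/locked_run; `locked_run_sketch` and the lock-file path are hypothetical, and using the non-blocking -n flag (skip the tick rather than queue behind it) is an assumption about the behavior.

```shell
#!/bin/sh
# Sketch: run a command only if no other run holds the same local lock file.
# locked_run_sketch LOCK_FILE COMMAND [ARGS...]
# Hypothetical helper; requires flock(1) from util-linux or BusyBox.
locked_run_sketch() {
  lock_file=$1
  shift
  # -n: fail immediately instead of waiting when the lock is already held.
  flock -n "$lock_file" "$@"
}

# Usage (commented out so the sketch has no side effects):
# locked_run_sketch /tmp/backup.lock /bin/backup
```

The lock is released automatically when the wrapped command exits, so a crashed job cannot leave this local lock behind the way a killed Restic process can leave a repository lock.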
Explicit operator unlock: /bin/unlock¶
When you have confirmed a lock is stale, use /bin/unlock rather than raw
restic unlock: it provides masked logging, last-unlock.json, hooks, mail and
webhooks, the same audit surface as the other helpers.
Preventive checklist¶
- Shared repository: keep RESTIC_AUTO_UNLOCK=OFF; use /bin/unlock only after restic list locks shows no legitimate holder.
- Retention: set FORGET_CRON on multi-host repos; reuse RESTIC_FORGET_ARGS and stagger schedules between hosts (or run forget from a single "owner" deployment).
- --retry-lock=DURATION: add to RESTIC_FORGET_ARGS (Restic ≥ 0.16) so forget waits for the exclusive lock instead of returning 11 immediately.
- Heavy maintenance: run PRUNE_CRON and CHECK_CRON on one owner container per repository where possible, so N replicas do not all schedule the same exclusive work.
- Graceful shutdown: give the container SIGTERM and enough stop grace time so Restic can exit cleanly; hard kills correlate with stale locks.
- Monitoring: alert on repeated non-zero exits and on restic_backup_last_forget_exit_code / forget worker metrics when 11 persists (schedule collision, not a one-off skip). See Prometheus metrics.
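As an illustration of the first three checklist items, one host's environment on a shared repository might look like the fragment below. All values are examples only: the retention flags and the ten-minute wait are arbitrary, and the five-field cron syntax is an assumption, so check Environment variables for the format this image expects.

```shell
# Sketch: environment for one host on a shared repository (illustrative values).
RESTIC_AUTO_UNLOCK=OFF
# Assumed 5-field cron syntax; give each host a different minute or hour.
FORGET_CRON="30 3 * * *"
# --retry-lock (Restic >= 0.16) waits for the exclusive lock instead of exiting 11.
RESTIC_FORGET_ARGS="--keep-daily 7 --keep-weekly 4 --retry-lock=10m"
```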
Further reading¶
- Unlock: operator helper and --dry-run / --remove-all.
- Backup worker → Multi-host and exit 11.
- Check worker: scheduling checks on multi-host repos.
- Forget worker: FORGET_CRON and exit 11 handling.