Upgrading¶
This page captures the upgrade notes for every release that needs operator attention. Patch releases that only fix bugs without changing behaviour are omitted; see the full Changelog for every release.
How versions are cut¶
| Bump | Trigger | What you should expect |
|---|---|---|
| PATCH | Bugfix, docs-only change, rebuild tweak, Restic patch bump without behaviour change. | Drop in. No config changes required. |
| MINOR | New feature, new env variable, new script hook, materially new behaviour. | Drop in; pre-existing behaviour preserved. May expose new optional knobs. |
| MAJOR | Breaking configuration, path, or runtime contract change. | Read the upgrade note carefully. May require config rename. |
See Versioning policy for the full semver contract.
2.4.0 → 2.5.0¶
Multi-host retention hardening. Three composing changes, all backwards
compatible — single-host setups and operators who never touched
RESTIC_FORGET_ARGS see no behaviour change.
1. New standalone forget worker (FORGET_CRON)¶
/bin/forget is a new worker that mirrors the existing /bin/prune
shape: own flock on /var/run/forget.lock, own log
(/var/log/forget-last.log), own JSON summary
(/var/log/last-forget.json), own Prometheus textfile
(restic_forget.prom), own mail subject ([OK|FAIL N] Forget …),
own webhook payload, own pre-forget / post-forget "$rc" hooks. It
reuses RESTIC_FORGET_ARGS verbatim — no duplicate retention env
var.
Activate by setting FORGET_CRON (default: empty). When set,
/bin/backup automatically skips its inline post-backup forget
(cron log records ⏭ Skipping inline forget: FORGET_CRON is set …)
so the repository's exclusive forget-lock is only ever taken inside
this dedicated maintenance window. Recommended for repositories
shared by multiple hosts: eliminates the exit-11 race entirely.
environment:
  FORGET_CRON: "30 1 * * *"
  RESTIC_FORGET_ARGS: "--retry-lock=5m --keep-daily 7 --keep-weekly 8 --keep-monthly 12"
See Forget worker for the full state machine, sample configurations and exit-code reference.
2. Soft-skip semantic for restic forget exit 11¶
On a repository shared by multiple hosts, two simultaneous backups
can race for the exclusive lock that restic forget needs. Only one
acquires it; the other returns restic exit 11 ("failed to lock
repository"). Through 2.4.0 the worker treated that as a hard failure
(❌ Forget Failed with Status 11) and, with RESTIC_AUTO_UNLOCK=ON,
would unlock the other host's legitimate lock — a foot-gun on
multi-host setups.
2.5.0 downgrades the exit-11 case to an informational skip in both
the inline path (/bin/backup) and the new standalone worker
(/bin/forget):
- Cron log records ⏭ Forget skipped: repository was locked by another host (exit 11). Retention will catch up on the next backup tick.
- The backup itself still exits 0; last-backup.json keeps exit_code: 0.
- restic unlock is intentionally never run on exit 11 regardless of RESTIC_AUTO_UNLOCK (the lock we lost is another host's legitimate lock).
- All other non-zero forget exits keep their existing fail-loud handling.
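The decision described above can be sketched roughly as a small shell function. This is an illustration only, not the project's actual code: the function name forget_action and the action words it prints are hypothetical, and the "fail+unlock" branch assumes that RESTIC_AUTO_UNLOCK=ON still triggers an unlock for other non-zero forget exits.

```shell
# Hypothetical sketch: map a restic forget exit code to the handling
# described in this section. Not the project's actual implementation.
forget_action() {
  rc="$1"
  auto_unlock="${2:-OFF}"   # stands in for RESTIC_AUTO_UNLOCK
  case "$rc" in
    0)  echo "ok" ;;
    11) echo "skip" ;;      # another host holds the lock: never unlock
    *)  if [ "$auto_unlock" = "ON" ]; then
          echo "fail+unlock"   # assumed: existing fail-loud path with opt-in unlock
        else
          echo "fail"
        fi ;;
  esac
}
```

Note that exit 11 maps to "skip" even when the second argument is ON, which mirrors the rule that restic unlock is never run for a lock owned by another host.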
3. forget_exit_code field in last-backup.json¶
The inline forget result is now recorded separately as
forget_exit_code: <0|11|other> alongside exit_code in
last-backup.json. The value is auto-promoted to a
restic_backup_last_forget_exit_code Prometheus gauge so monitoring
can alert on persistent skipping without false-flagging the backup
itself. (The standalone worker exposes the same number as its own
top-level exit_code plus a restic_forget_last_exit_code gauge.)
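For setups that do not scrape Prometheus, a monitoring script could read the field straight from the summary file. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the field is assumed to be a plain integer as described above, and a real setup would more likely use a JSON parser such as jq.

```shell
# Hypothetical helper: print the forget_exit_code value found in a
# summary file (for example /var/log/last-backup.json).
forget_exit_code() {
  sed -n 's/.*"forget_exit_code"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Example policy: succeed only when the inline forget was skipped (exit 11).
forget_was_skipped() {
  [ "$(forget_exit_code "$1")" = "11" ]
}
```

A cron job could call forget_was_skipped on consecutive runs and alert only when the skip persists, which keeps the backup's own exit_code out of the alert condition.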
Upgrade actions¶
- Single host, no RESTIC_FORGET_ARGS set: drop-in, nothing to change.
- Single host, RESTIC_FORGET_ARGS set: drop-in. Consider adding --retry-lock=DURATION if you ever expect to add a second host.
- Multi-host, hitting ❌ Forget Failed with Status 11: the failures are now ⏭ skips automatically. For a permanent fix, pick one of:
  - Best: set FORGET_CRON on a single maintenance-owner container (or on each container with staggered times) and let the dedicated worker own the lock window.
  - Add --retry-lock=5m to RESTIC_FORGET_ARGS.
  - Stagger BACKUP_CRON between hosts.
See Backup worker → Multi-host repositories and exit 11 for the full story.
2.3.x → 2.4.0¶
Purely additive. New /bin/mount-snapshot helper wraps restic mount
(FUSE) read-only under /fusemount, scoped to this container's
--host "$HOSTNAME" and --tag "$RESTIC_TAG" by default.
Defaults are designed for the common case "give me last night's
snapshot tree, scoped to this host": mounts on /fusemount
(container-internal by design, created at image build, never collides
with /bin/restore output or a host bind-mount on /restore).
The target is created if missing and must be empty unless --force is given. The helper refuses
/data, BACKUP_ROOT_DIR and other system/source directories
without --force, and registers an EXIT trap that calls
fusermount -u (with umount fallback) so SIGINT, SIGTERM or a
restic crash always unmounts cleanly.
Use --repo-wide to expose every snapshot, --path (repeatable) to
filter by snapshot path, and --allow-other when another UID (e.g. a
host bind-mount consumer) needs read access to the FUSE tree.
FUSE inside the container still requires --cap-add SYS_ADMIN
--device /dev/fuse (or the Kubernetes securityContext equivalents).
On Ubuntu/Debian hosts (Docker's default AppArmor profile) you also
need --security-opt apparmor=unconfined because the docker-default
profile denies mount(2) regardless of CAP_SYS_ADMIN; the helper
pre-flights this and aborts early with a precise hint if any one of
the four FUSE knobs is wrong.
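As a quick reference, the docker run flags named above can be collected in one place. This is a sketch, not an official invocation: the function name is hypothetical, IMAGE is a placeholder, and the AppArmor flag is only needed on hosts that apply the docker-default profile.

```shell
# Hypothetical helper: print the docker run flags named in this note,
# one token per line.
fuse_docker_flags() {
  printf '%s\n' \
    --cap-add SYS_ADMIN \
    --device /dev/fuse \
    --security-opt apparmor=unconfined   # Ubuntu/Debian hosts with docker-default AppArmor
}

# Usage (IMAGE is a placeholder for your image name):
#   docker run $(fuse_docker_flags) IMAGE
```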
Default --target changed within the 2.4.0 development cycle
Earlier 2.4.0 development tags defaulted to --target /restore.
The released 2.4.0 defaults to --target /fusemount to prevent
collisions with /bin/restore and host bind-mounts. If you wired
a workflow against the old default, pass --target /restore
explicitly (or update your scripts to the new path).
The helper writes /var/log/last-mount-snapshot.json, supports
pre/post-mount-snapshot hooks, webhooks, mail and the
restic_mount_snapshot.prom Prometheus textfile.
2.2.x → 2.3.0¶
Purely additive. New /bin/forget-preview helper previews retention with
restic forget --dry-run using RESTIC_FORGET_ARGS.
By default it scopes the preview to the current container's HOSTNAME
and RESTIC_TAG, which is safer for repositories shared by multiple
hosts. Use --repo-wide only when you intentionally want to preview the
policy against every snapshot in the repository.
It writes /var/log/last-forget-preview.json, supports
pre/post-forget-preview hooks, webhooks, mail and Prometheus metrics.
2.2.1 → 2.2.2¶
Patch / docs release. Adds the Material for MkDocs documentation site
under docs/ and the GitHub Pages workflow. No runtime behaviour change.
2.2.0 → 2.2.1¶
Patch release. CI-only fix in app/snapshot_export.sh:
- Combined # shellcheck disable=SC2317,SC2329 on the EXIT-trap cleanup() function (false positive about unreachable code).
- Explicit copyErrorLog "${LAST_LOGFILE}" "${LAST_ERROR_LOGFILE}" call to satisfy SC2119 in newer shellcheck versions.
No runtime behaviour change. No env-var change. Drop in.
2.1.x → 2.2.0¶
Purely additive. New /bin/snapshot-export helper restores a selected
snapshot (or include-filtered subtree) into a temporary work directory and
packages it as a .tar.gz archive under /restore by default. It supports
--id, --include, --exclude, --output, --dry-run, --verify,
hooks, JSON, webhook, mail and Prometheus metrics.
2.0.x → 2.1.0¶
Purely additive. New /bin/doctor read-only diagnostics command for
support/triage. Prints release/tool versions, masked effective env, path
checks, restic cat config probe, replicate job-file validation, hook
executable status, recent last-*.json summaries and the tail of
cron.log.
docker run … doctor and docker run … /bin/doctor execute it directly
without starting cron, so it works equally well as an entrypoint and via
docker exec.
1.18.x → 2.0.0 ¶
The old "sync/bisync" surface is renamed to replicate. The runtime keeps a compatibility bridge until 3.0.0, but plan to migrate on your schedule.
What changed¶
| Old name | New name |
|---|---|
| app/bisync.sh worker | app/replicate.sh |
| /bin/bisync | /bin/replicate (with /bin/bisync symlink kept until 3.0.0) |
| /var/run/bisync.lock | /var/run/replicate.lock |
| /var/log/sync-last.log | /var/log/replicate-last.log |
| /var/log/sync-error-last.log | /var/log/replicate-error-last.log |
| /var/log/sync-mail-last.log | /var/log/replicate-mail-last.log |
| /var/log/last-sync.json | /var/log/last-replicate.json |
| restic_sync.prom | restic_replicate.prom |
| /hooks/pre-sync.sh, /hooks/post-sync.sh | /hooks/pre-replicate.sh, /hooks/post-replicate.sh |
| Mail subject Sync | Replicate |
| config/sync_jobs.txt (sample) | config/replicate_jobs.txt |
| SYNC_CRON | REPLICATE_CRON |
| SYNC_JOB_FILE | REPLICATE_JOB_FILE |
| SYNC_JOB_ARGS | REPLICATE_JOB_ARGS |
| SYNC_VERBOSE | REPLICATE_VERBOSE |
| SYNC_BISYNC_CHECK_ACCESS | REPLICATE_BISYNC_CHECK_ACCESS |
Compatibility bridge¶
- /bin/bisync is symlinked to /bin/replicate until 3.0.0.
- All SYNC_* env vars are read at startup and mapped to their REPLICATE_* counterparts when the new name is unset, with a deprecation warning in cron.log so you can see what is still legacy.
- The rclone per-job MODE value bisync is unchanged. Job rows of the form SOURCE;DESTINATION;bisync, SOURCE;DESTINATION;sync and SOURCE;DESTINATION;copy keep working as-is.
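The env-var mapping rule can be pictured with a small sketch. It is not the runtime's actual code: the function name bridge_env and the warning text are hypothetical, and the real warning goes to cron.log rather than stderr.

```shell
# Hypothetical sketch of the bridge rule: copy the legacy variable to the
# new name only when the legacy one is set and the new one is unset.
bridge_env() {
  old="$1"; new="$2"
  eval "old_set=\${$old+set}"
  eval "new_set=\${$new+set}"
  if [ -n "$old_set" ] && [ -z "$new_set" ]; then
    eval "export $new=\"\${$old}\""
    echo "DEPRECATED: $old is set; use $new instead" >&2
  fi
}
```

Calling bridge_env SYNC_CRON REPLICATE_CRON leaves an already-set REPLICATE_CRON untouched, so the new name always wins when both are present.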
What you must do¶
- Migrate the env vars. Rename SYNC_* → REPLICATE_* in your docker-compose.yml, .env, Kubernetes manifest. Helpful one-liner:
- Rename the job file (or set the env var). The installed default is now REPLICATE_JOB_FILE=/config/replicate_jobs.txt. If you keep the old filename, set it explicitly:
- Update monitoring / scrapers that read last-sync.json or restic_sync.prom; those files no longer exist.
- Update hooks if you have /hooks/pre-sync.sh / /hooks/post-sync.sh. The runtime does not read them anymore; rename to pre-replicate.sh / post-replicate.sh.
The compatibility bridge means you can do these in any order. The old SYNC_*
env var names continue to work, but each emits a deprecation warning into
cron.log until you switch over.
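One possible way to do the env-var rename is a sed filter. This is a sketch rather than the project's own one-liner: the function name is hypothetical, and it only touches the five variable names listed in the rename table, leaving look-alikes such as RSYNC_CRON alone.

```shell
# Hypothetical helper: rewrite legacy SYNC_* names read from stdin to
# their REPLICATE_* counterparts and print the result to stdout.
rename_sync_vars() {
  sed -E 's/(^|[^A-Za-z0-9_])SYNC_(CRON|JOB_FILE|JOB_ARGS|VERBOSE|BISYNC_CHECK_ACCESS)/\1REPLICATE_\2/g'
}

# Usage (review the result before replacing the original file):
#   rename_sync_vars < docker-compose.yml > docker-compose.yml.new
```

Writing to a separate file first makes it easy to diff the result; the filter is also safe to run twice, because already-renamed names no longer match.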
Removal in 3.0.0¶
In 3.0.0 (no date set yet), the bridge will be removed: the /bin/bisync
symlink and the SYNC_* env-var mapping go away, and all log, JSON and
Prometheus names exist only under the replicate spelling. Plan the
migration on your schedule.
1.16.x → 1.17.0¶
Purely additive. New operator-driven /bin/restore wrapper:
- Interactive on a TTY (docker exec -ti …); flag-driven otherwise.
- Mail / webhook notifications are on by default for restores (same MAILX_RCPT, WEBHOOK_URL plumbing as the cron-driven workers).
- New /var/log/last-restore.json summary, new restic_restore_last_* Prometheus gauges, optional /hooks/{pre,post}-restore.sh.
The manual restic restore latest --target /restore invocation still works
unchanged.
1.15.x → 1.16.0¶
Purely additive — no env rename, no behaviour change in cron workers.
- SBOM generation for image builds via SBOM=ON ./build.sh when syft is on PATH. Release CI also uploads source-tree SBOMs on tag releases.
- scripts/docker-compose.yml now ships two opt-in Compose profiles: metrics (node-exporter sidecar) and dev (mailhog SMTP catcher). Existing docker compose up invocations keep starting only the main service.
- Multiple backup jobs pattern at examples/compose/multi-job.yml.
- Hardening section in the README enumerates capabilities to drop and tmpfs paths needed for read_only: true.
1.14.x → 1.15.0¶
Purely additive.
- New opt-in METRICS_DIR exports Prometheus textfile metrics.
- New opt-in SYNC_BISYNC_CHECK_ACCESS (now REPLICATE_BISYNC_CHECK_ACCESS) appends --check-access to every routine bisync run and the recovery resync. Requires the well-known RCLONE_TEST marker on both endpoints; rclone aborts loudly when it's missing instead of treating one side as "everything deleted".
- Mail subjects gained the [OK|FAIL N] <Job> <host> · <duration> · <details> prefix; update any subject-based filter rules.
- Container logs now mask inline credentials in replicate source/destination URLs (mask_endpoint).
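A subject-based filter could be adapted along these lines. The sketch assumes the prefix renders as "[OK]" on success and "[FAIL N]" (with N a number) on failure; the function name and the sample subjects in the usage note are hypothetical.

```shell
# Hypothetical filter helper: classify a mail subject by its status prefix.
subject_status() {
  case "$1" in
    "[OK] "*)        echo "ok" ;;
    "[FAIL "*"] "*)  echo "fail" ;;
    *)               echo "unknown" ;;   # e.g. subjects from before 1.15.0
  esac
}
```

For example, subject_status '[FAIL 1] Backup host1 · 3s · error' prints fail, while a subject without the prefix falls through to unknown.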
1.13.x → 1.14.0 ¶
RESTIC_TAG="" (explicitly empty) is now a hard failure with exit code
2. Pick something meaningful (daily, ${HOSTNAME}-data, …) so snapshots
can be filtered by tag later. The Dockerfile still defaults to automated,
so installs that never set RESTIC_TAG are unaffected.
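The difference between an unset and an explicitly empty RESTIC_TAG can be illustrated with a short sketch. It is not the image's actual startup code: the function name and error text are hypothetical, and the fallback to automated stands in for the Dockerfile default.

```shell
# Hypothetical sketch: reject an explicitly empty RESTIC_TAG with exit
# code 2, otherwise print the effective tag.
check_restic_tag() {
  if [ "${RESTIC_TAG+set}" = "set" ] && [ -z "$RESTIC_TAG" ]; then
    echo "RESTIC_TAG must not be empty" >&2
    return 2
  fi
  # Unset falls back to the default; a non-empty value is used as-is.
  echo "${RESTIC_TAG:-automated}"
}
```

The ${RESTIC_TAG+set} expansion is what distinguishes "set but empty" from "never set", which plain -z cannot do.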
Replicate job files gained optional MODE / EXTRA_ARGS columns; existing
two-column lines keep working as bisync.
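The column defaulting can be sketched as follows. This assumes the optional columns follow the same ';' separator as the SOURCE;DESTINATION;MODE rows shown elsewhere on this page, with EXTRA_ARGS as a fourth column; the function name and the sample paths in the usage note are hypothetical.

```shell
# Hypothetical parser sketch for one job-file line.
parse_job_line() {
  # Split on ';' into up to four fields; the last field keeps any remainder.
  IFS=';' read -r src dst mode extra <<EOF
$1
EOF
  # A missing MODE column falls back to bisync, as for two-column lines.
  printf 'source=%s destination=%s mode=%s extra=%s\n' \
    "$src" "$dst" "${mode:-bisync}" "$extra"
}
```

With this sketch, parse_job_line '/data;remote:backup' reports mode=bisync, while a line ending in ;copy;--dry-run reports mode=copy and carries the extra argument through.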
1.11.x → 1.12.0+¶
Automatic restic unlock after backup / check failures is opt-in via
RESTIC_AUTO_UNLOCK=ON. The new default leaves the lock alone — safer for
repositories shared across multiple hosts where an automatic unlock could
clear another host's legitimate lock.
The restic unlock --remove-all call in /entry.sh after a failed
restic init is unaffected, because that lock can only have been created
by the failing init attempt itself.