Sources report
/bin/sources-report is a read-only pre-flight inventory of the
paths your backup will actually read. It answers four questions in one
go, before BACKUP_CRON ever ticks:
- Is `BACKUP_ROOT_DIR` mounted, readable, and the kind of thing I think it is?
- Are the `--files-from` files referenced in `RESTIC_JOB_ARGS` readable, non-empty, and do their listed entries actually exist on disk inside the container?
- Are the `--exclude-file` files readable, and how many pattern lines are in each?
- Roughly how big is the data set (file count + bytes)? This keeps the estimated backup window a planning number rather than a surprise.
It runs only when an operator invokes it; the helper never schedules it from cron.
Why it exists
Backups silently widen or narrow when:
- a bind-mount disappears (`BACKUP_ROOT_DIR` becomes an empty directory and the next backup looks "fine, just smaller"),
- a `--files-from` file accumulates stale entries (rows that point at paths the operator removed months ago; those backups silently skip parts of the dataset),
- an `--exclude-file` is mistyped in `RESTIC_JOB_ARGS` and restic silently treats the missing pattern source as "no excludes",
- a freshly cloned host inherits a list of paths that don't exist there yet.
/bin/doctor catches these as boolean
readable / not-readable / missing warnings. /bin/sources-report
goes one step further: it tells you how much data is behind each
source, so a --files-from typo that drops 80 % of the dataset is
visible at a glance instead of "discovered" three retention cycles
later.
Quick start
# Default scope: BACKUP_ROOT_DIR + every --files-from / --exclude-file
# discovered in RESTIC_JOB_ARGS.
docker exec -ti restic-backup-helper /bin/sources-report

# Fast mode for slow / remote sources (NFS, SFTP, cloud mounts):
# skip `du -sk` and file-count, only check readability + type.
docker exec -ti restic-backup-helper /bin/sources-report --no-size

# Ad-hoc inspection of an extra path that isn't in BACKUP_ROOT_DIR /
# RESTIC_JOB_ARGS yet. Repeatable.
docker exec -ti restic-backup-helper /bin/sources-report \
  --source /mnt/new-dataset --source /mnt/backup-staging

# One-shot via docker run (no cron startup); useful in CI smoke jobs.
docker run --rm \
  --env-file restic.env \
  -v /srv/documents:/data:ro \
  -v ./config:/config:ro \
  marc0janssen/restic-backup-helper:latest \
  sources-report --no-size
Flags
| Flag | Default | Purpose |
|---|---|---|
| `--source PATH` | – | Add an ad-hoc path to inspect. Repeatable. By default the report covers `BACKUP_ROOT_DIR` plus every `--files-from` referenced in `RESTIC_JOB_ARGS`. |
| `--files-from FILE` | – | Add a `--files-from` file to inspect. Repeatable. By default the report scans `RESTIC_JOB_ARGS` for them. |
| `--no-size` | off | Skip size estimation (`du -sk` + `find -type f`). Only check readability, type and the line counts of `--files-from` / `--exclude-file`. Recommended on slow / remote sources. |
| `--depth N` | unlimited | Cap directory traversal depth at `N` levels when counting files. No effect with `--no-size`. Useful on very large trees. |
| `--help` | – | Print usage and exit. |
What it does
flowchart TD
A[sources-report] --> B[pre-sources-report hook]
B --> C{Compose scope}
C --> C1[BACKUP_ROOT_DIR<br/>+ --source flags]
C --> C2[--files-from in RESTIC_JOB_ARGS<br/>+ --files-from flags]
C --> C3[--exclude-file in RESTIC_JOB_ARGS]
C1 --> D{per source}
D --> D1[stat: type<br/>readable?]
D1 --> D2{--no-size?}
D2 -- no --> D3[du -sk + find -type f]
D2 -- yes --> D4[skip sizing]
C2 --> E{per files-from}
E --> E1[readable?<br/>count non-comment lines<br/>check each entry exists]
C3 --> F{per exclude-file}
F --> F1[readable?<br/>count pattern lines]
D3 --> G[Aggregate totals]
D4 --> G
E1 --> G
F1 --> G
G --> H[Render log table]
H --> I[Write last-sources-report.json<br/>with nested sources / files_from / exclude_files arrays]
I --> J[Optional restic_sources_report.prom]
J --> K{MAILX_RCPT? WEBHOOK_URL?}
K --> L[mail + webhook]
L --> M["post-sources-report hook with $rc"]
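The files-from branch of the flow (readable, count non-comment lines, check each entry exists) could look roughly like the sketch below. It is not the helper's actual code: the function name `check_files_from` is made up for illustration, and it treats every entry as a literal path, whereas restic's `--files-from` also accepts glob patterns.

```sh
#!/bin/sh
# Sketch only: check_files_from is a hypothetical name.
# Usage: check_files_from FILE   -> prints "<lines> <missing_entries>"
check_files_from() {
    list=$1
    lines=0
    missing=0

    if [ ! -r "$list" ]; then
        echo "unreadable"
        return 1
    fi

    # "|| [ -n "$entry" ]" keeps a final line that lacks a trailing newline.
    while IFS= read -r entry || [ -n "$entry" ]; do
        case $entry in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        lines=$((lines + 1))
        # Assumption: entries are literal paths, not glob patterns.
        [ -e "$entry" ] || missing=$((missing + 1))
    done < "$list"

    printf '%s %s\n' "$lines" "$missing"
}
```

A list with two entries, one of which no longer exists, would print `2 1`, which corresponds to the `lines` and `missing_entries` fields of the JSON summary.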
Scope discovery
The default scope mirrors exactly what /bin/backup would feed
restic:
sources       = BACKUP_ROOT_DIR (if set)
              + each --source PATH from the CLI
files_from    = every --files-from / --files-from-verbatim /
                --files-from-raw value found in RESTIC_JOB_ARGS
              + each --files-from FILE from the CLI
exclude_files = every --exclude-file / --iexclude-file value found in
                RESTIC_JOB_ARGS
So the report is grounded in the same env-driven contract as the backup itself; you don't have to keep two lists in sync.
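To illustrate the discovery step, here is a minimal sketch of pulling flag values out of a `RESTIC_JOB_ARGS`-style string. The function name `discover_flag_values` is hypothetical and the helper's real parser may differ. The sketch assumes whitespace-separated arguments and does not handle quoted values that contain spaces.

```sh
#!/bin/sh
# Sketch only: discover_flag_values is a hypothetical name.
# Usage: discover_flag_values "ARGS STRING" FLAG [FLAG...]
# Prints one value per line for every "--flag value" or "--flag=value".
discover_flag_values() {
    args=$1
    shift
    expect_value=0

    set -f    # disable globbing while the string is word-split
    for word in $args; do
        if [ "$expect_value" -eq 1 ]; then
            printf '%s\n' "$word"
            expect_value=0
            continue
        fi
        for flag in "$@"; do
            case $word in
                "$flag")   expect_value=1 ;;                 # value is the next word
                "$flag"=*) printf '%s\n' "${word#*=}" ;;     # value follows "="
            esac
        done
    done
    set +f
}

# Example:
#   discover_flag_values "$RESTIC_JOB_ARGS" \
#       --files-from --files-from-verbatim --files-from-raw
```

Matching each flag exactly (or exactly up to `=`) keeps `--files-from` from accidentally matching `--files-from-verbatim`.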
Estimate semantics
- Size figure is unfiltered. `du -sk` is run on each source without applying restic's exclude rules. The report lists exclude files separately so you can reason about expected exclusions yourself; the helper deliberately does not re-implement restic's matcher.
- `du -sk` is POSIX-portable. Works on both BusyBox `du` (Alpine default) and GNU coreutils `du`. Kilobytes are multiplied by `1024` to get the bytes figure in the JSON.
- File count uses `find -type f`. Symlinks pointing at directories are not followed (matches what restic actually stores).
- `--depth N` caps `find` depth. Useful on huge trees, at the cost of an under-count for deeper hierarchies: the JSON `files` field reflects the cap, not the true number.
- `--no-size` reports `-1` for the per-source `files` and `bytes`. The JSON encodes `-1` rather than `0` so consumers can distinguish "skipped" from "really empty".
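The sizing rules above can be sketched in a few lines of shell. This is an illustration under assumptions, not the helper's implementation: `estimate_source` is a made-up name, and using `find -maxdepth` for the depth cap is a guess at how `--depth N` could be realised.

```sh
#!/bin/sh
# Sketch only: estimate_source is a hypothetical name.
# Usage: estimate_source PATH [DEPTH]   -> prints "<files> <bytes>"
estimate_source() {
    path=$1
    depth=${2:-}

    # du -sk prints "<kilobytes><TAB><path>"; keep the first field only.
    kb=$(du -sk "$path" | awk '{ print $1 }')

    # find does not follow symlinks unless told to (-L), so symlinked
    # directories are not descended into.
    if [ -n "$depth" ]; then
        # -maxdepth is not in POSIX, but BusyBox and GNU find both support it.
        files=$(find "$path" -maxdepth "$depth" -type f | wc -l | tr -d ' ')
    else
        files=$(find "$path" -type f | wc -l | tr -d ' ')
    fi

    # Kilobytes times 1024 gives the bytes figure.
    printf '%s %s\n' "$files" "$((kb * 1024))"
}
```

Note that the depth cap only affects the file count here; the `du` figure still covers the whole tree. File names containing newlines would inflate the count in this sketch.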
Audit trail
The helper writes:
- `/var/log/sources-report-last.log`: human-readable table + per-source / per-files-from / per-exclude-file details.
- `/var/log/sources-report-error-last.log`: copied on failure.
- `/var/log/last-sources-report.json`: flat aggregates plus nested `sources`, `files_from`, `exclude_files` arrays.
- `restic_sources_report.prom`: when `METRICS_DIR` is configured.
Hooks:
/hooks/pre-sources-report.sh # informational; failure does not abort the report
/hooks/post-sources-report.sh "$exit_code" # always called with the report exit code as $1
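As an illustration of the post hook contract, a minimal `/hooks/post-sources-report.sh` might look like the sketch below. The function name `describe_exit` and the message wording are made up; only the meaning of `$1` and of exit codes `0` and `2` comes from this page.

```sh
#!/bin/sh
# Sketch of a post hook. $1 is the report exit code passed by the helper.
describe_exit() {
    case ${1:-} in
        0) echo "sources-report finished; check errors_count for unreadable entries" ;;
        2) echo "sources-report configuration error: no sources or invalid --depth" ;;
        *) echo "sources-report exited with unexpected code '${1:-}'" ;;
    esac
}

# Write the message to stderr so it ends up next to other hook output.
describe_exit "${1:-}" >&2
```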
Mail and webhook notifications use the same MAILX_* and WEBHOOK_*
settings as the cron-driven workers.
JSON summary
In addition to the common fields (job, hostname, release,
started_at, finished_at, duration_seconds, exit_code):
| Field | Type | Description |
|---|---|---|
| `backup_root_dir` | string | Value of `BACKUP_ROOT_DIR` at report time. |
| `sources_count` | integer | Number of unique source paths inspected (`BACKUP_ROOT_DIR` + `--source`). |
| `files_from_count` | integer | Number of `--files-from` files inspected. |
| `exclude_files_count` | integer | Number of `--exclude-file` files inspected. |
| `total_files` | integer | Sum of `find -type f` counts across all sized sources. `0` when `--no-size` was set. |
| `total_bytes` | integer | Sum of `du -sk * 1024` across all sized sources. `0` when `--no-size` was set. |
| `errors_count` | integer | Count of unreadable / missing entries (sources, `--files-from` files themselves, `--files-from` line contents, `--exclude-file` files). |
| `no_size` | string | `ON` / `OFF` mirroring `--no-size`. |
| `depth_limit` | string | `--depth N` value, or `unlimited`. |
| `sources` | array | Per-source detail: `{path, readable, type, files, bytes}`. `files` / `bytes` are `-1` when `--no-size` was set. |
| `files_from` | array | Per-file detail: `{path, readable, lines, missing_entries}`. |
| `exclude_files` | array | Per-file detail: `{path, readable, patterns}`. |
Example:
{
  "job": "sources-report",
  "hostname": "backup-node",
  "release": "2.14.1-0.18.1",
  "started_at": "2026-05-13T15:30:00+0200",
  "finished_at": "2026-05-13T15:30:08+0200",
  "duration_seconds": 8,
  "exit_code": 0,
  "backup_root_dir": "/data",
  "sources_count": 1,
  "files_from_count": 1,
  "exclude_files_count": 1,
  "total_files": 18247,
  "total_bytes": 9876543210,
  "errors_count": 0,
  "no_size": "OFF",
  "depth_limit": "unlimited",
  "sources": [
    {"path": "/data", "readable": true, "type": "directory", "files": 18247, "bytes": 9876543210}
  ],
  "files_from": [
    {"path": "/config/files-from.txt", "readable": true, "lines": 4, "missing_entries": 0}
  ],
  "exclude_files": [
    {"path": "/config/excludes.txt", "readable": true, "patterns": 12}
  ]
}
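One way to turn a bytes figure from the summary into a rough backup window is to divide it by an assumed throughput. The sketch below does that with integer shell arithmetic. The function name `estimate_window_seconds` and the 50 MiB/s rate in the example are illustrative assumptions, not values the helper produces, and the result ignores excludes, deduplication and compression.

```sh
#!/bin/sh
# Sketch only: estimate_window_seconds is a hypothetical name.
# Usage: estimate_window_seconds BYTES RATE_MIB_PER_S
# Prints the number of seconds, rounded up, or "unknown" if sizing was skipped.
estimate_window_seconds() {
    bytes=$1
    rate_mib=$2

    if [ "$rate_mib" -le 0 ]; then
        echo "rate must be positive" >&2
        return 2
    fi
    if [ "$bytes" -lt 0 ]; then
        echo "unknown"    # -1 means sizing was skipped (--no-size)
        return 1
    fi

    rate_bytes=$((rate_mib * 1024 * 1024))
    # Ceiling division without floating point.
    echo $(( (bytes + rate_bytes - 1) / rate_bytes ))
}

# With the example summary: estimate_window_seconds 9876543210 50  -> 189
```

At the assumed rate, the example data set would take a little over three minutes to read in full.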
Exit codes
| Exit | Meaning |
|---|---|
| `0` | Report produced. Individual unreadable entries are signalled via `errors_count` in the JSON (and the log) but do not by themselves trigger a non-zero exit. |
| `2` | Configuration error: no sources to inspect (`BACKUP_ROOT_DIR` empty, no `--source`, no `--files-from` discovered or passed), or invalid `--depth`. |
The "always exit 0 even with unreadable entries" choice is
intentional: this is a report, not a gate. Wire alerts on
restic_sources_report_last_errors_count > 0 if you want loud CI
behaviour.
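For CI jobs without a Prometheus setup, a gate can also be built on the JSON summary. The following is a hedged sketch: `json_int_field` and `gate_on_errors` are hypothetical names, and the `sed` expression is not a JSON parser. It assumes the summary is written with one top-level `"key": value` pair per line, as in the example on this page.

```sh
#!/bin/sh
# Sketch only: both function names are hypothetical.

# Print the integer value of a top-level field, assuming one field per line.
json_int_field() {
    sed -n "s/^[[:space:]]*\"$2\":[[:space:]]*\(-\{0,1\}[0-9][0-9]*\).*/\1/p" "$1" | head -n 1
}

# Return 0 if errors_count is 0, 1 if it is positive, 2 if it cannot be read.
gate_on_errors() {
    report=${1:-/var/log/last-sources-report.json}
    errors=$(json_int_field "$report" errors_count)

    if [ -z "$errors" ]; then
        echo "gate: errors_count not found in $report" >&2
        return 2
    fi
    if [ "$errors" -gt 0 ]; then
        echo "gate: $errors unreadable or missing entries" >&2
        return 1
    fi
    return 0
}

# Example CI step:
#   gate_on_errors /var/log/last-sources-report.json || exit 1
```

If `jq` is available in the CI image, it would be a more robust way to read the field than the `sed` expression used here.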
When to use it
- Before configuring a new backup, especially when you reuse an existing `--files-from` from another host or a documentation template.
- After editing `RESTIC_JOB_ARGS` in your env file.
- As a CI smoke step. Cheap pre-flight: catches missing mounts, unreadable secrets, silently-empty `--files-from` files, and size deltas the next backup will produce. Combine with `/bin/doctor` for a complete pre-flight bundle.
- After a host migration or a Docker Compose reshuffle, to confirm bind-mounts still resolve where the env says they should.
See also
- Backup worker: same `BACKUP_ROOT_DIR` / `RESTIC_JOB_ARGS` contract; this helper inspects it without taking a lock.
- Forget preview: same operator-driven pattern, for retention policy.
- Diagnostics (doctor): overlapping path checks at a coarser granularity; doctor reports readability without sizes.
- JSON summaries: schema for `last-sources-report.json`.
- Prometheus metrics: `restic_sources_report_*` gauges.
- Hooks: `pre-sources-report` / `post-sources-report` registration.