date | title | draft | toc |
---|---|---|---|
2023-07-06T14:32:00-05:00 | Backups with Restic | true | false |
For the longest time, my backup setup has been a fairly dumb script that I run
manually and that has no features other than encryption. After getting my feet
wet with btrfs somewhat recently and seeing the magic of deduplication,
compression, and snapshots, I was all-in on these features and wanted them for
my backups too.
TL;DR
- Install `restic` on both machines (may only be needed on the backupper?)
- Create a `restic` user on the backuppee, set up a directory for backups with the right permissions, and add the backupper's public key
- `restic -r sftp:restic@backuppee:/backups init` to set up a repo; put the password in a secret place accessible only to the backupper user
- `for d in $DIRS; do RESTIC_PASSWORD_COMMAND="load secret restic-key" restic -r sftp:restic@backuppee:/backups backup "$d"; done`
Planning
The most important thing to work out when it comes to backups is what you are protecting. It's easy enough to just back up everything, and I know plenty of folks who do this! However, I'm not that type. I like to keep things pretty minimal, so I'll evaluate which things are truly worth backing up.
In my case, the only things I really want to back up are unique data that cannot be easily reproduced, such as the following:
- Family photos and videos
- Secrets, keys, and anything else that provides access to other important assets
- Communications and their context (emails, texts, etc.)
- Backups of devices for restoration in case of bricking (phones and Nintendo consoles come to mind)
- Source code for projects
My current solutions for these are varied:
- Family Pictures: Google Photos
- I would love to possibly look into a self-hosted solution, but Google Photos is unfortunately too easy and cheap
- Secrets: These go into a combination of `password-store` and Vaultwarden.
- The `password-store` database is backed up pretty much automatically via `git` to a handful of places, and the data is encrypted at rest.
- The Vaultwarden database is part of the "mission critical" server backup that happens manually. These backups are untested, but anything in here should ultimately be recoverable from redundancies in the `password-store` database via "forgot my password" mechanisms.
- Communications: These are handled by whatever cloud services the communications happen over. Email is all Gmail at the moment, chats are varied, etc.
- Device Backups: These have been simple enough. Copy the backups to the various backup locations and forget about them.
- Code: I have pretty good `git` habits after almost 15 years of version controlling, so I push code all the time to a server, which in turn backs up the code.
So where am I putting this data? I have a few larger disks here at my house, but I also host a sprinkling of machines at friends and family's houses out of the way and we share space as needed, allowing all of us to have redundant remote backups. That said, my machines there are not the most robust. Here are things I'm concerned about:
- Running out of space
- No deduplication means this will happen eventually.
- Bitrot
- They say it's rare, and perhaps I'm confusing disk damage with bitrot, but I have definitely been bitten by this in some form or another. I want my backup system to combat this as much as possible (checksums and error correction via btrfs) but also to somehow regularly and automatically let me know if and when it occurs.
- Not automated
- I would have a lot more peace-of-mind if I knew I could just backup everything nightly and not worry about it.
Backing up everything nightly has not been an option so far, since I have ~1TB of data backed up and I currently just sync everything in the local backup directory over via rsync. I know, I've probably got the wrong flags, since rsync should be just fine for this, but I also wanted deduplication and a system that would let me pull out individual files if I wanted.
Enter Restic
Restic seemed pretty much perfect. It seemed simple enough to set up and manage, so I gave it a shot.
My current setup is certainly "good enough", but the lack of automation and the terribly inefficient use of bandwidth with my remote hosts are not ideal.
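As a rough sketch of how restic lines up with the concerns listed earlier, the helpers below wrap three restic subcommands: `snapshots` to list what is in the repo, `check --read-data-subset` to re-read and verify a slice of the stored data, and `restore --include` to pull a single path out of the latest snapshot. The repo URL and the `load secret restic-key` password command are the ones from the TL;DR; the function names and the choice of seven slices are made up for illustration.

```shell
#!/bin/sh
# Sketch only: the repo URL and password command come from the TL;DR;
# the function names and the 7-way split are illustrative assumptions.
REPO="sftp:restic@backuppee:/backups"
export RESTIC_PASSWORD_COMMAND="load secret restic-key"

# List the snapshots stored in the repository.
list_snapshots() {
    restic -r "$REPO" snapshots
}

# Check the repository structure and re-read one seventh of the stored data.
# $1 = which of the seven slices to read (1-7); seven runs with different
# slices cover the whole repository.
verify_slice() {
    restic -r "$REPO" check --read-data-subset="$1/7"
}

# Restore a single path from the latest snapshot.
# $1 = path as it was backed up, $2 = directory to restore into.
restore_path() {
    restic -r "$REPO" restore latest --target "$2" --include "$1"
}
```

Calling `verify_slice "$(date +%u)"` once a day would work through the whole repo every week, and a non-zero exit status is the signal that something has rotted. Deduplication needs no helper at all, since restic does it on every backup.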
Setup
I aimed to keep things pretty secure, so I set up a user specifically for my backups on my backuppee devices: the machines with big hard disks (calm down) that would hold the backed-up data.
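Creating that user on a backuppee might look roughly like the sketch below, run as root. The `restic` user name and `/backups` directory match the TL;DR; the `/home/restic` home directory, the `DRY_RUN` switch, and the function names are assumptions for illustration, and the exact `useradd` flags may differ between distributions.

```shell
#!/bin/sh
# Sketch only, meant to run as root on the backuppee. The user name and
# backup directory match the TL;DR; the home layout is an assumption.
BACKUP_USER="restic"
BACKUP_DIR="/backups"

# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1 = file containing the backupper's public SSH key.
setup_backuppee() {
    pubkey_file="$1"
    home_dir="/home/$BACKUP_USER"

    # Dedicated, unprivileged account (with its own group) for backups.
    run useradd --create-home --home-dir "$home_dir" --user-group "$BACKUP_USER"

    # Backup directory that only this user can read or write.
    run install -d -o "$BACKUP_USER" -g "$BACKUP_USER" -m 0700 "$BACKUP_DIR"

    # Authorize the backupper's key for SSH/SFTP logins as this user.
    # Note: this replaces any existing authorized_keys file.
    run install -d -o "$BACKUP_USER" -g "$BACKUP_USER" -m 0700 "$home_dir/.ssh"
    run install -o "$BACKUP_USER" -g "$BACKUP_USER" -m 0600 \
        "$pubkey_file" "$home_dir/.ssh/authorized_keys"
}
```

Running `DRY_RUN=1 setup_backuppee backupper.pub` (a hypothetical key file name) prints the four commands without touching the system, which is a handy way to review them first.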
My backupper machines would be the ones pushing data, which, for now, was really just one box: my server, which houses the actually important data.
All my other machines really just interface with that server to push and pull code and data via git. Pretty much anything else should be lose-able in my situation. I use Google Photos for backing up pictures, so I don't really worry about those for now. The only other data I want backed up is older backups (giant tarballs).
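On the backupper side, the whole thing could be wrapped up in something like the following sketch. The repo URL, the `init` and `backup` commands, and the `load secret restic-key` password command are from the TL;DR; the directories in `DIRS` and the function names are placeholders, not my real paths.

```shell
#!/bin/sh
# Sketch only: REPO and the password command come from the TL;DR;
# the entries in DIRS are placeholder paths.
REPO="sftp:restic@backuppee:/backups"
DIRS="/srv/git /srv/old-backups"
export RESTIC_PASSWORD_COMMAND="load secret restic-key"

# One-time step: create the repository on the backuppee.
init_repo() {
    restic -r "$REPO" init
}

# Back up each directory as its own snapshot. Stop at the first failure so
# the non-zero exit status makes the problem visible to cron or a timer.
backup_all() {
    for d in $DIRS; do
        restic -r "$REPO" backup "$d" || return 1
    done
}
```

After a one-time `init_repo`, calling `backup_all` from a nightly cron job or systemd timer would cover the "not automated" concern, and only new or changed data crosses the wire on each run.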