diff --git a/content/blog/restic-backups.md b/content/blog/restic-backups.md
new file mode 100644
index 0000000..8bccd3a
--- /dev/null
+++ b/content/blog/restic-backups.md
@@ -0,0 +1,115 @@
+---
+date: "2023-07-06T14:32:00-05:00"
+title: "Backups with Restic"
+draft: true
+toc: false
+---
+
+# TL;DR
+
+- Install `restic` on the backupper (with the `sftp` backend, the backuppee
+  only needs an SSH server with SFTP enabled)
+- Create a `restic` user on the backuppee, set up a directory for backups with
+  the right permissions, and add the backupper's public key
+- `restic -r sftp:restic@backuppee:/backups init` to set up a repo, then put
+  the password in a secret place accessible only to the backupper user
+- `for d in $DIRS; do RESTIC_PASSWORD_COMMAND="load secret restic-key" restic -r sftp:restic@backuppee:/backups backup "$d"; done`
+
+# Intro
+
+For the longest time, my backup setup has been [a quite-dumb script I run
+manually][backupify] with no features other than encryption. After getting my
+feet wet with `btrfs` somewhat recently and seeing the magic of deduplication,
+compression, and snapshots, I was all-in on these features and wanted them for
+my backups, too.
+
+[backupify]: https://git.lyte.dev/lytedev/dotfiles/src/commit/afed5fa6bbb6bc2733a3cadcb940d6599823b59d/common/bin/backupify
+
+I also have a friend who has been using `btrfs` snapshots for some time, and I
+was super impressed with the simplicity of his setup. It made me want to
+improve mine!
+
+# Planning
+
+The most important thing when it comes to backups is to think about what you
+are protecting. It's easy enough to just back up _everything_, and I know
+plenty of folks who do this! However, I'm not that type. I like to keep
+things pretty minimal, so I evaluated which things truly are worth backing up.
+
+In my case, the only things I really want to back up are unique data that
+cannot be easily reproduced, such as the following:
+
+- Family photos and videos
+- Secrets, keys, and anything else that provides access to other important
+  assets
+- Communications and their context (emails, texts, etc.)
+- Backups of devices for restoration in case of bricking (phones and Nintendo
+  consoles come to mind)
+- Source code for projects
+
+My current solutions for these are varied:
+
+- **Family photos and videos**: Google Photos
+  - I would love to look into a self-hosted solution someday, but Google
+    Photos is unfortunately too easy and cheap
+- **Secrets**: These go into a combination of `password-store` and Vaultwarden.
+  - The `password-store` database is backed up pretty much automatically via
+    `git` to a handful of places, and the data is encrypted at rest.
+  - The Vaultwarden database is part of the "mission critical" server backup
+    that happens manually. These backups are untested, but anything in here
+    should ultimately be recoverable from redundancies in the `password-store`
+    database or via "forgot my password" mechanisms.
+- **Communications**: These are handled by whatever cloud services the
+  communications happen over. Email is all Gmail at the moment, chats are
+  varied, etc.
+- **Device backups**: These have been simple enough: copy the backups to the
+  various backup locations and forget about them.
+- **Code**: I have pretty good `git` habits after almost 15 years of version
+  controlling, so I push code all the time to a server which in turn backs up
+  the code.
+
+So where am I putting this data? I have a few larger disks here at my house,
+but I also host a sprinkling of machines at friends' and family's houses out
+of the way, and we share space as needed, allowing all of us to have redundant
+remote backups.
+
+That said, my machines there are not the most robust. Here are the things I'm
+concerned about:
+
+- Running out of space
+  - With no deduplication, this _will_ happen eventually.
+- Bitrot
+  - They say it's rare, and perhaps I'm confusing disk damage with bitrot, but
+    I have definitely been bitten by this in some form or another. I want my
+    backup system to combat it as much as possible (checksums and error
+    correction via `btrfs`), but also to somehow regularly and automatically
+    let me know if and when it occurs.
+- No automation
+  - I would have a lot more peace of mind if I knew everything was backed up
+    nightly and I didn't have to worry about it.
+
+Backing up everything nightly isn't currently an option, since I have ~1TB of
+data backed up and I just sync over everything in the local backup directory
+via `rsync`. I know, I've probably got the wrong flags, since `rsync` should
+be just fine for this, but I also wanted deduplication and a system that would
+let me pull out individual files if I wanted.
+
+# Enter Restic
+
+My current setup is certainly "good enough", but the lack of automation and
+the terribly inefficient use of bandwidth with my remote hosts were _not_
+ideal. Restic seemed pretty much perfect: deduplicated, encrypted,
+snapshot-based storage that looked simple enough to set up and manage, so I
+gave it a shot.
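+
+To make that concrete, here's roughly the workflow that sold me. These are
+stock `restic` commands; the repository address matches the TL;DR above, and
+the paths are just illustrative:
+
+```bash
+# create an encrypted repository over sftp (prompts for a password)
+restic -r sftp:restic@backuppee:/backups init
+
+# back up a directory; only chunks the repo hasn't already seen get uploaded
+restic -r sftp:restic@backuppee:/backups backup /srv/important
+
+# list snapshots, then pull individual files back out of the latest one
+restic -r sftp:restic@backuppee:/backups snapshots
+restic -r sftp:restic@backuppee:/backups restore latest --target /tmp/restored
+```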
+
+# Setup
+
+I aimed to keep things pretty secure, so I set up a user specifically for my
+backups on my backuppee devices, the machines with big hard disks (calm down)
+that would hold the backed-up data.
+
+My backupper machines would be the ones pushing data, which, for now, is
+really just one box: my server, which houses the actually important data.
+
+All my other machines really just interface with that server to push and pull
+code and data via `git`. Pretty much anything else should be lose-able in my
+situation. I use Google Photos for backing up pictures, so I don't really
+worry about those for now. The only other data I want backed up is older
+backups (giant tarballs).
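+
+Concretely, the whole dance looks something like the sketch below. Hedged
+heavily: the usernames and paths mirror the TL;DR, the public key is a
+placeholder, and `load secret` is my own secret-management helper, so adjust
+for your tooling:
+
+```bash
+# --- on the backuppee (holds the data) ---
+sudo useradd --create-home restic
+sudo mkdir --mode 700 /backups && sudo chown restic:restic /backups
+
+# let the backupper's key in (placeholder key!)
+sudo -u restic mkdir --mode 700 ~restic/.ssh
+echo 'ssh-ed25519 AAAA... backupper' | sudo -u restic tee ~restic/.ssh/authorized_keys
+sudo -u restic chmod 600 ~restic/.ssh/authorized_keys
+
+# --- on the backupper (pushes the data) ---
+restic -r sftp:restic@backuppee:/backups init
+
+# the nightly-ish job: back up each directory, then verify the repo
+DIRS="/srv/data /srv/device-backups"  # illustrative paths
+for d in $DIRS; do
+  RESTIC_PASSWORD_COMMAND="load secret restic-key" \
+    restic -r sftp:restic@backuppee:/backups backup "$d"
+done
+RESTIC_PASSWORD_COMMAND="load secret restic-key" \
+  restic -r sftp:restic@backuppee:/backups check
+```
+
+The `restic check` at the end is there for the bitrot worry: it verifies the
+repository's internal consistency (add `--read-data` to read and verify every
+blob), and wiring this into a cron job or systemd timer that yells at me on
+failure would cover the "tell me when it happens" requirement.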