site.lyte.dev/content/blog/restic-backups.md

2023-07-10 12:07:48 -05:00
---
date: "2023-07-06T14:32:00-05:00"
title: "Backups with Restic"
draft: true
toc: false
---
For the longest time, my backup setup has been [a fairly dumb script that I run
manually][backupify], with no features other than encryption. After getting my
feet wet with `btrfs` somewhat recently and seeing the magic of deduplication,
compression, and snapshots, I was all-in on these features and wanted them for
my backups too.
<!--more-->
# TL;DR
- Install `restic` on the backupper (with the SFTP backend, the backuppee only
  needs to accept SSH/SFTP connections)
- Create a `restic` user on the backuppee, set up a directory for backups with
  the right permissions, and add the backupper's public key
- `restic -r sftp:restic@backuppee:/backups init` to set up a repo, then put the
  password in a secret place accessible only to the backupper user
- `for d in $DIRS; do RESTIC_PASSWORD_COMMAND="load secret restic-key" restic -r sftp:restic@backuppee:/backups backup "$d"; done`
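
A slightly more defensive version of that last loop might look like the sketch
below. The repository URL and the `load secret restic-key` command come straight
from the list above; the `backup_dirs` function name, the `DRY_RUN` switch, and
the example paths are placeholders invented for illustration, not part of my
actual setup.

```shell
#!/bin/sh
# Sketch only: REPO and the password command come from the TL;DR above;
# backup_dirs and DRY_RUN are hypothetical names used for illustration.
REPO="sftp:restic@backuppee:/backups"
export RESTIC_PASSWORD_COMMAND="load secret restic-key"

backup_dirs() {
  # Back up each directory passed as an argument into the same repository.
  for d in "$@"; do
    if [ -n "${DRY_RUN:-}" ]; then
      # Print what would run instead of contacting the backuppee.
      printf 'restic -r %s backup %s\n' "$REPO" "$d"
    else
      # Stop at the first failure so a broken run is not silently ignored.
      restic -r "$REPO" backup "$d" || return 1
    fi
  done
}

# Example (hypothetical paths): backup_dirs /srv/git /srv/backups
```

Running it once with `DRY_RUN=1` set is a cheap way to check which directories
would be sent before anything touches the network.
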
# Planning
The most important thing to figure out when it comes to backups is what you are
actually protecting. It's easy enough to just back up _everything_, and I know
plenty of folks who do this! However, I'm not that type. I like to keep things
pretty minimal, so I evaluated which things are truly worth backing up. In my
case, the only things I really want to back up are unique data that cannot be
easily reproduced, such as the following:
- Family photos and videos
- Secrets, keys, and anything else that provides access to other important
assets
- Communications and their context (emails, texts, etc.)
- Backups of devices for restoration in case of bricking (phones and Nintendo
  consoles come to mind)
- Source code for projects
My current solutions for these are varied:
- **Family Pictures**: Google Photos
  - I would love to look into a self-hosted solution, but Google Photos is
    unfortunately too easy and cheap.
- **Secrets**: These go into a combination of `password-store` and Vaultwarden.
  - The `password-store` database is backed up pretty much automatically via
    `git` to a handful of places, and the data is encrypted at rest.
  - The Vaultwarden database is part of the "mission critical" server backup
    that happens manually. These backups are untested, but anything in here
    should ultimately be recoverable from redundancies in the `password-store`
    database via "forgot my password" mechanisms.
- **Communications**: These are handled by whatever cloud services the
  communications happen over. Email is all Gmail at the moment, chats are
  varied, etc.
- **Device Backups**: These have been simple enough. Copy the backups to the
  various backup locations and forget about them.
- **Code**: I have pretty good `git` habits after almost 15 years of using
  version control, so I push code all the time to a server which in turn backs
  up the code.
So where am I putting this data? I have a few larger disks here at my house, but
I also host a sprinkling of machines, tucked out of the way at friends' and
family's houses, and we share space as needed, allowing all of us to have
redundant remote backups. That said, my machines there are not the most robust.
Here are the things I'm concerned about:
- Running out of space
  - No deduplication means this _will_ happen eventually.
- Bitrot
  - They say it's rare, and perhaps I'm confusing disk damage with bitrot, but
    I definitely have been bitten by this in some form or another. I want my
    backup system to combat this as much as possible (checksums and error
    correction via `btrfs`) but also to somehow regularly and automatically let
    me know if and when it occurs.
- Not automated
  - I would have a lot more peace of mind if I knew I could just back up
    everything nightly and not worry about it.
Backing up everything nightly was not an option with that setup, since I have
~1TB of data backed up and was just syncing over everything in the local backup
directory via rsync. I know, I've probably got the wrong flags, since rsync
should be just fine for this, but I also wanted deduplication and a system that
would let me pull out individual files if I wanted.
# Enter Restic
Restic pretty much seemed perfect. It seemed simple enough to set up and manage,
so I gave it a shot. My current setup is certainly "good enough", but the lack
of automation and the terribly inefficient use of bandwidth with my remote hosts
were _not_ ideal.
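
For the space, bitrot, and automation worries from the planning section, a
nightly maintenance step might look roughly like the sketch below. `restic
forget --prune` and `restic check --read-data-subset` are real restic commands,
but the retention numbers, the seven-way split, the `nightly_maintenance` and
`run` names, the `DRY_RUN` switch, and the script path in the crontab comment
are all assumptions made up for illustration.

```shell
#!/bin/sh
# Sketch only: retention numbers, the 7-way split, and all function names
# are placeholder assumptions, not a tested configuration.
REPO="sftp:restic@backuppee:/backups"
export RESTIC_PASSWORD_COMMAND="load secret restic-key"

run() {
  # With DRY_RUN set, print the command instead of executing it.
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

nightly_maintenance() {
  # Drop old snapshots and reclaim their space so the disks do not fill up.
  run restic -r "$REPO" forget --keep-daily 7 --keep-weekly 4 --prune || return 1
  # Check the repository structure and re-read one of seven slices of the
  # stored data. The slice is picked by ISO weekday (1-7), so a full week of
  # nightly runs touches every slice once.
  run restic -r "$REPO" check --read-data-subset="$(date +%u)/7" || return 1
}

# A crontab entry along these lines could run a script that calls
# nightly_maintenance at 03:00 every night (the path is hypothetical):
# 0 3 * * * /usr/local/bin/restic-nightly.sh
```

A non-zero exit from either command is the hook for the "let me know" part:
cron can mail the output, or the script can call whatever notifier is handy.
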
# Setup
I aimed to keep things pretty secure, so I set up a user specifically for my
backups on my backuppee devices, the machines with big hard disks (calm down)
that would hold the backed-up data.
My backupper machines would be the ones pushing data, which, for now, was really
just one box: my server, which houses the actually important data.
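
As a rough sketch, the backuppee side of that could be done with something like
the following, run as root. The `restic` user name and the `/backups` directory
match the TL;DR; the `/home/restic` home directory, the permission modes, the
`setup_backuppee` and `run` names, and the `DRY_RUN` switch are assumptions for
illustration. Note that it overwrites `authorized_keys`, which is only sensible
for a freshly created user.

```shell
#!/bin/sh
# Sketch only: run as root on the backuppee. The home directory location,
# modes, and function names are assumptions, not a record of the real setup.

run() {
  # With DRY_RUN set, print the command instead of executing it.
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

setup_backuppee() {
  # $1: path to a copy of the backupper's SSH public key.
  pubkey_file=$1
  # Dedicated, unprivileged user that only exists to receive backups.
  run useradd --create-home restic || return 1
  # Repository directory, readable and writable by the restic user only.
  run install -d -m 700 -o restic -g restic /backups || return 1
  # SSH directory plus the backupper's key as the sole authorized key.
  run install -d -m 700 -o restic -g restic /home/restic/.ssh || return 1
  run install -m 600 -o restic -g restic "$pubkey_file" \
    /home/restic/.ssh/authorized_keys || return 1
}

# Example (hypothetical key path): setup_backuppee /root/backupper.pub
```
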
All my other machines really just interface with that server to push and pull
code and data via git. Pretty much anything else should be lose-able in my
situation. I use Google Photos for backing up pictures, so I don't really worry
about those for now. The only other data I want backed up is older backups
(giant tarballs).
[backupify]: https://git.lyte.dev/lytedev/dotfiles/src/commit/afed5fa6bbb6bc2733a3cadcb940d6599823b59d/common/bin/backupify