headscale: derive tailnet DNS records from dns-updater list #544

Open
lytedev wants to merge 1 commit from headscale-extra-records into main
Owner

Summary

Replaces the abandoned #542 (which tried subnet routing and broke the LAN). Goal is still the same: let tailnet clients (notably phones on cellular using beefcake as exit node) reach internal services like git.lyte.dev.

Approach: instead of advertising the LAN subnet over tailscale, have headscale's MagicDNS return beefcake's tailnet IP for those hostnames via extra_records. Tailnet clients get 100.64.0.2, reach beefcake directly through the existing ACL, no subnet route involved. LAN clients still query the router's dnsmasq and get 192.168.0.9 as before.

Single source of truth: lyte.dns-updater.records already lists every subdomain beefcake serves on lyte.dev (those records get registered with the public DNS server via nsupdate). The headscale module derives extra_records from the same list — no second registry to keep in sync.

Wildcards like *.vpn.h are filtered out since extra_records only supports exact names.
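
The derivation described above can be sketched roughly as follows. This is an illustration in Python, not the Nix code from this PR. It assumes the dns-updater list holds names relative to lyte.dev (the hypothetical sample `git`, `paperless.h`, `*.vpn.h`) and that each entry becomes an A record using headscale's `name`/`type`/`value` shape. The function name `derive_extra_records` is invented for the example.

```python
from typing import Dict, List

ZONE = "lyte.dev"                    # zone named in the PR description
BEEFCAKE_TAILNET_IP = "100.64.0.2"   # beefcake's tailnet IP per the PR description


def derive_extra_records(
    records: List[str],
    zone: str = ZONE,
    ip: str = BEEFCAKE_TAILNET_IP,
) -> List[Dict[str, str]]:
    """Map zone-relative names to extra_records entries, skipping wildcards.

    Assumption: `records` contains names relative to `zone`; the real
    lyte.dns-updater.records option may be shaped differently.
    """
    out = []
    for name in records:
        # extra_records only supports exact names, so wildcard entries are dropped.
        if "*" in name:
            continue
        out.append({"name": f"{name}.{zone}", "type": "A", "value": ip})
    return out


if __name__ == "__main__":
    # Hypothetical sample list, including one wildcard that gets filtered out.
    for record in derive_extra_records(["git", "paperless.h", "*.vpn.h"]):
        print(record)
```

Because the list is only read and never copied, adding a subdomain to the dns-updater list would show up in both the public DNS registration and the tailnet answers without a second edit.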

Why not the subnet-route approach (closed #542)

Advertising 192.168.0.0/24 from beefcake, combined with beefcake also running --accept-routes, caused beefcake's own tailscaled to install the route back via tailscale0, blackholing its own LAN. DNS resolution failed (router unreachable), MagicDNS fell through to 1.1.1.1 which returned the public IP, then nothing could hairpin → git.lyte.dev was down for every host on the network. Rolled back live.

This approach has none of those failure modes since nothing reroutes — it's purely a DNS-layer answer.

Test plan

  • Deploy beefcake (nixos-rebuild switch).
  • Verify headscale picked up extra_records: check that services.headscale.settings.dns.extra_records is rendered into the headscale config. (sudo headscale policy get prints only the ACL policy, so it will not show these records.)
  • From dragon: dig git.lyte.dev should return 100.64.0.2 (tailnet IP), not 192.168.0.9 (LAN).
  • From LAN client without tailscale: dig git.lyte.dev should still return 192.168.0.9.
  • From phone on cellular, exit-node ON: https://git.lyte.dev/ loads.
  • From phone on cellular, exit-node OFF (just admindevice ACL): https://git.lyte.dev/ still loads (because reaching beefcake's tailnet IP directly doesn't depend on the exit node).
  • Spot check: a service like paperless.h.lyte.dev also works the same way from tailnet.
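
The two dig checks in the test plan come down to asking which network an answer belongs to. Below is a minimal Python sketch of that classification. It assumes the tailnet allocates addresses from the 100.64.0.0/10 CGNAT range and that the LAN is 192.168.0.0/24, as mentioned in this PR. The helper name `classify` is hypothetical.

```python
import ipaddress

# Assumption: the tailnet hands out addresses from the CGNAT range.
TAILNET = ipaddress.ip_network("100.64.0.0/10")
# LAN subnet as mentioned in the PR description.
LAN = ipaddress.ip_network("192.168.0.0/24")


def classify(answer: str) -> str:
    """Return 'tailnet', 'lan', or 'public' for a single resolved IPv4 address."""
    addr = ipaddress.ip_address(answer)
    if addr in TAILNET:
        return "tailnet"
    if addr in LAN:
        return "lan"
    return "public"


if __name__ == "__main__":
    # Expected from a tailnet client (e.g. dragon) and from a LAN-only client.
    print(classify("100.64.0.2"))
    print(classify("192.168.0.9"))
```

One way to use it is to pass in the output of `dig +short git.lyte.dev`: a tailnet client should see `tailnet`, a LAN client without tailscale should see `lan`, and `public` on either would suggest the lookup fell through to an upstream resolver.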
fix(beefcake/syncthing): retry syncthing-init on failure
All checks were successful
/ check-format (push) Successful in 7s
/ build (push) Successful in 7s
5bcfeeb97e
syncthing-init is a oneshot that pushes declarative settings into the
running syncthing instance. If syncthing.service restarts mid-run (e.g.
during a NixOS rebuild), the init is SIGTERM'd and, with no Restart=,
stays failed until someone manually resets it. This was responsible for
a stale OpenObserve systemd_unit_failed alert on beefcake that had been
ringing since 2026-05-12.

Add Restart=on-failure with a 30s backoff so a transient restart
self-heals.
headscale: derive tailnet DNS records from dns-updater list
All checks were successful
/ check-format (push) Successful in 8s
/ build (push) Successful in 5m59s
982263cdae
Phones on cellular using beefcake as exit node couldn't reach
internal-only services that DNS resolved to LAN IPs (git.lyte.dev →
192.168.0.9). The previous attempt — advertising 192.168.0.0/24 as a
subnet route — broke beefcake's own LAN access because beefcake itself
runs --accept-routes and installed its own advertised route back via
tailscale0, blackholing the router and cascading DNS failures
everywhere.

Instead: have headscale return beefcake's tailnet IP for those
hostnames via MagicDNS's extra_records. Tailnet clients now resolve
git.lyte.dev → 100.64.0.2 directly, reach beefcake via the existing
admindevice → *:* ACL, and never need a subnet route. LAN clients
keep going through the router's dnsmasq → 192.168.0.9 unchanged.

Single source of truth: lyte.dns-updater.records already lists every
subdomain beefcake serves on lyte.dev (registered with the public DNS
server via nsupdate). The headscale config derives extra_records from
the same list — no second registry to keep in sync. Wildcards like
*.vpn.h are filtered out since extra_records only supports exact names.