k8s-clusters/home/readme.md
2022-07-18 13:50:10 -05:00

# Home Cluster
> **NOTE**: Scripts below are in `fish` shell.
## TODO
- **Netboot**: https://www.sidero.dev/v0.5/getting-started/prereq-dhcp/
  - Can probably leverage `dnsmasq` on the router for this?
## Setup
### Networking
- Prepare networking
  - Internally:
    - Add a DNS entry for the cluster endpoint (router's `/etc/hosts` + `dnsmasq`) that points to the initial node
  - Externally:
    - Add a DNS entry for the cluster endpoint that points to the router
    - Set up the router to forward external requests to the initial node
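
A quick way to confirm the internal DNS entry works, sketched under a few assumptions: `dnsmasq` on the router serves names from `/etc/hosts` (its default behavior), the hostname and address are the example values used elsewhere in this document, and `dig` and `nc` are installed on the client.

```fish
# Assumed router /etc/hosts entry (hostname and address are this document's example values):
#   10.0.0.101 kube-cluster.home.lyte.dev

set ENDPOINT_HOST 'kube-cluster.home.lyte.dev'

# From a machine on the LAN: should resolve to the initial node's address
dig +short "$ENDPOINT_HOST"

# Once the node is running, check that the Kubernetes API port is reachable
nc -vz "$ENDPOINT_HOST" 6443
```
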
### Setup Kubernetes Cluster
> **Source**: https://www.talos.dev/v1.1/introduction/getting-started/
```fish
#!/usr/bin/env fish
# these are my values, you will want your own
set CLUSTER_NAME 'home'
set CLUSTER_ENDPOINT 'https://kube-cluster.home.lyte.dev:6443'
set NODE_ADDR '10.0.0.101'
# extract the age public key (the recipient) from the key file stored in `pass`
set AGE_KEY (pass age-key | rg '# public key: ' | awk '{print $4}')
```
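
Since everything later is encrypted to `$AGE_KEY`, it may be worth a sanity check before going further. This is a minimal sketch: `looks_like_age_recipient` is a hypothetical helper (not part of the original scripts), and the pattern is only a loose check that the value starts with `age1`, not full validation.

```fish
# Hypothetical helper: succeeds only if the argument looks like an age public key.
function looks_like_age_recipient
    string match -q -r '^age1[0-9a-z]+$' -- "$argv[1]"
end

# Warn early if the key extraction above produced nothing usable.
if not looks_like_age_recipient "$AGE_KEY"
    echo "AGE_KEY is empty or malformed; check the output of 'pass age-key'" >&2
end
```
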
- Set up the talos directory if needed
  - `mkdir -p talos; cd talos`
- If you are not using _this_ configuration:
  - `talosctl gen config "$CLUSTER_NAME" "$CLUSTER_ENDPOINT"`
  - Edit files as needed, making sure only one of the controlplane nodes is the `endpoint` in the `talosconfig`
  - `mv talosconfig talosconfig.yaml`
  - Encrypt via `sops` with `age`
    - `for f in *; sops --encrypt --age "$AGE_KEY" --in-place "$f"; end`
- Set up the `talosctl` client to use your configuration
  - `sops exec-file talosconfig.yaml 'talosctl config merge {}'`
- For each node in the cluster as specified in `talosconfig.yaml`, do the
  following:
  - Boot the Talos image on the node
    - Disconnect the boot media from the node after it has booted, otherwise
      your Ventoy drive will get wiped
  - Apply the appropriate configuration to the node
    - `sops exec-file controlplane.yaml 'talosctl apply-config --insecure --nodes '"$NODE_ADDR"' --file {}'` (use `worker.yaml` for worker nodes)
    - This can take a moment to finish, but you can move on to the next node
      while you wait
- Bootstrap the cluster (run this once, against a single controlplane node)
  - `talosctl bootstrap --nodes "$NODE_ADDR"`
  - You will need to wait a bit for Kubernetes to initialize
- Pull down the kubeconfig
  - `talosctl kubeconfig`

Once the cluster has finished initializing _and starting up_, you should be
able to run `kubectl get nodes`.
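
The per-node steps above can be sketched as a loop. This assumes the default file names produced by `talosctl gen config` (`controlplane.yaml`, `worker.yaml`), already encrypted with `sops`; the node lists are hypothetical and the timeout is arbitrary.

```fish
# Hypothetical node lists; replace with the addresses of your own nodes.
set CONTROLPLANE_NODES '10.0.0.101'
set WORKER_NODES '10.0.0.102' '10.0.0.103'

# Apply the matching (sops-encrypted) machine config to each freshly booted node.
for addr in $CONTROLPLANE_NODES
    sops exec-file controlplane.yaml "talosctl apply-config --insecure --nodes $addr --file {}"
end
for addr in $WORKER_NODES
    sops exec-file worker.yaml "talosctl apply-config --insecure --nodes $addr --file {}"
end

# Bootstrap exactly once, against a single controlplane node.
# The node has to finish installing first, so this may need a retry.
talosctl bootstrap --nodes $CONTROLPLANE_NODES[1]

# Fetch the kubeconfig, then block until every node reports Ready (or time out).
talosctl kubeconfig --nodes $CONTROLPLANE_NODES[1]
kubectl wait --for=condition=Ready nodes --all --timeout=10m
kubectl get nodes -o wide
```

If `kubectl wait` fails right away, the API server is most likely still starting; re-running it after a minute or two is usually enough.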
#### Adding Nodes
> **TODO**: This process is untested!
- Boot the Talos image on the target node
- Add the node to `talosconfig.yaml`
  - `sops talos/talosconfig.yaml`
- Set up the `talosctl` client to use your configuration
  - `sops exec-file talos/talosconfig.yaml 'talosctl config merge {}'`
- Apply the appropriate configuration to all nodes in the cluster
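
A rough sketch of the apply step for the new node, equally untested. It assumes the new node is a worker, that the encrypted config lives at `talos/worker.yaml`, and that the commands run from the repository root; `NEW_NODE_ADDR` is a hypothetical value.

```fish
# Hypothetical address of the node being added.
set NEW_NODE_ADDR '10.0.0.104'

# Decrypt the worker config on the fly and push it to the new node.
sops exec-file talos/worker.yaml "talosctl apply-config --insecure --nodes $NEW_NODE_ADDR --file {}"

# Watch for the node to register and become Ready (Ctrl-C to stop).
kubectl get nodes --watch
```
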
#### Removing Nodes
> **TODO**: This process is untested!
- Cordon and drain the node
- Remove the node from `talosconfig.yaml`
  - `sops talos/talosconfig.yaml`
- Update the `talosctl` client to use your configuration
  - `sops exec-file talos/talosconfig.yaml 'talosctl config merge {}'`
- Apply the appropriate configuration to all nodes in the cluster
- Power down the node
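
The first and last steps might look like the following, with the same "untested" caveat. `NODE_NAME` and `OLD_NODE_ADDR` are hypothetical values, and deleting the node object is an extra step assumed here rather than taken from the list above.

```fish
# Hypothetical values for the node being removed.
set NODE_NAME 'worker-3'
set OLD_NODE_ADDR '10.0.0.104'

# `kubectl drain` cordons the node first, then evicts its pods.
kubectl drain "$NODE_NAME" --ignore-daemonsets --delete-emptydir-data

# Remove the node object from the Kubernetes API, then power the machine down.
kubectl delete node "$NODE_NAME"
talosctl shutdown --nodes "$OLD_NODE_ADDR"
```
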
#### Untaint Masters
Since we're "frugal" (cheap) and we want to use all the hardware for all the
things:
```bash
kubectl taint nodes --all node-role.kubernetes.io/master-
```
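
Newer Kubernetes releases renamed this taint to `node-role.kubernetes.io/control-plane`, so it is worth checking which key your nodes actually carry before removing it. A small sketch, assuming `kubectl` already points at the cluster:

```fish
# List each node with the keys of its taints (quoted so fish does not glob the `*`).
kubectl get nodes -o 'custom-columns=NAME:.metadata.name,TAINTS:.spec.taints[*].key'

# If the nodes carry the newer key, remove that taint instead.
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
```
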
### Apply Initial Manifests
Some manifests must be applied before we can let GitOps take over.
```bash
kubectl kustomize --enable-helm manifests/initialization | kubectl apply -f -
```
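
To preview the manifests before applying them, something like the following may help. It assumes a `kubectl` recent enough to support `kubectl kustomize --enable-helm` and a `helm` binary on the `PATH`; the `/tmp/initialization.yaml` path is just an illustrative scratch file.

```fish
# Render the kustomization (inflating any Helm charts it references) into a file for review.
kubectl kustomize --enable-helm manifests/initialization > /tmp/initialization.yaml

# Show what would change on the cluster; exits non-zero when there are differences.
kubectl diff -f /tmp/initialization.yaml

# Apply once the rendered output looks right.
kubectl apply -f /tmp/initialization.yaml
```
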
### Setting up GitOps
**TODO**
### Storage
**TODO**
## Load Balancing
I can _probably_ handle this with my router?

**TODO**