Documentation for my home lab CI/CD setup
This page documents issues and maintenance activities that occur during the operation of the home lab. Each entry includes the date, problem description, diagnosis steps, resolution, and follow-up notes.
Monitoring the Ryzen NAS after the Global C-States fix was applied to ensure the freeze/sleep issues were fully resolved.
N/A
After several tests and a long period of uptime, system stability has been confirmed. The C-States fix successfully prevented the system from entering unrecoverable sleep states.
The system would occasionally enter a sleep state from which it could not wake, requiring a hard reset. This is a known issue with first-generation Ryzen processors on Linux.
Investigated system logs leading up to freezes. Identified that the freezes occurred during idle periods, pointing towards power management and C-States.
Disabled Global C-States in the BIOS/UEFI settings, which prevents the CPU from entering the low-power sleep states that cause the system to hang.
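One way to see which idle states the kernel still exposes after the BIOS change is to read the cpuidle entries in sysfs. The following is a minimal sketch, assuming the standard /sys/devices/system/cpu/cpu0/cpuidle layout. list_idle_states is a hypothetical helper name, and what the list contains after disabling Global C-States depends on the firmware and kernel.

```shell
# Print the name of each idle state the kernel exposes for one CPU.
# $1: cpuidle directory (assumption: defaults to cpu0's usual sysfs path).
list_idle_states() {
    dir="${1:-/sys/devices/system/cpu/cpu0/cpuidle}"
    for state in "$dir"/state*; do
        # Skip the unexpanded glob when no states are present.
        [ -r "$state/name" ] || continue
        cat "$state/name"
    done
}

# Usage on the NAS:
#   list_idle_states
```

Comparing the output before and after the BIOS change gives a rough confirmation from inside the OS, without waiting for another freeze.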
The NAS (Casper) needed more computing power for handling all tasks including CI/CD and self-hosted services.
N/A
Rebuilt the NAS hardware, moving from an old AMD A10-7700K processor to an AMD Ryzen 1700 (repurposed old hardware) on AM4 with 16GB of DDR4 (previously 8GB of DDR3). Upgraded the OS drive from a 128GB SATA SSD to a 256GB NVMe SSD, and added two 1TB HDDs in a new ZFS pool called ‘casper-buffer’. Upgraded from OMV7 to OMV8 as part of the same rebuild. Successfully exported and reimported the ZFS pools, then reassigned and set up the Gitea server. Added a local Gitea runner called ‘casper-runner’ to handle all CI/CD tasks on the new hardware.
Devices using Pi-hole DNS lost internet connectivity. The Pi-hole host could communicate with local network devices but could not reach external addresses such as 1.1.1.1. Investigation showed the system had a static IPv4 address configured via NetworkManager (nmcli), but no IPv4 gateway was defined. As a result, no default route existed and the system could not reach upstream DNS servers.
Confirmed Pi-hole host had a valid local IP address (192.168.1.125) on interface eth0.
ping 192.168.1.1
Result: Successful.
ping 1.1.1.1
Result: Failed.
ip route
Output showed only the local network route:
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.125 metric 100
The expected default route (default via 192.168.1.1) was missing.
sudo ip route add default via 192.168.1.1
Identified root cause: the NetworkManager connection used manual IPv4 configuration but did not define an ipv4.gateway value.
nmcli connection modify <connection-name> ipv4.gateway 192.168.1.1
nmcli connection up <connection-name>
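The missing-route check from the diagnosis can be scripted so it is easy to repeat after a reboot. This is a minimal sketch that only parses `ip route` output passed on stdin. has_default_route is a hypothetical helper name, and the optional gateway argument is an addition for illustration.

```shell
# Succeeds (exit 0) if the routing table on stdin has a default route.
# $1 (optional): gateway address the default route must point at.
has_default_route() {
    awk -v gw="${1:-}" '
        $1 == "default" && (gw == "" || ($2 == "via" && $3 == gw)) { found = 1 }
        END { exit !found }
    '
}

# Usage on the Pi-hole host:
#   ip route | has_default_route 192.168.1.1 || echo "default route missing"
```

Running this after a reboot would confirm that the gateway set through nmcli persisted, unlike the temporary `ip route add` fix.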
Performed a Pi-hole gravity update and added DNS blocklists to prevent Meta Quest Pro devices from contacting Meta services, including firmware update and advertising endpoints. The Meta Quest Pro is discontinued, and updates are intentionally blocked at the DNS level.
pihole -g
https://raw.githubusercontent.com/ibrah3m/pihole-meta-quest-blocklist/main/hosts-firmware
https://raw.githubusercontent.com/ibrah3m/pihole-meta-quest-blocklist/main/hosts-ads
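Before adding a list, it can help to preview which domains it would block. The sketch below assumes the lists are in hosts format (an address followed by a domain on each line), which the file names suggest but which was not verified here. list_blocked_domains is a hypothetical helper name.

```shell
# Print the domains from a hosts-format blocklist read on stdin.
# Comment lines and blank lines are skipped.
list_blocked_domains() {
    awk '
        /^[ \t]*#/ { next }      # whole-line comment
        NF >= 2    { print $2 }  # "<address> <domain>"; blank lines have NF = 0
    '
}

# Usage (fetching one of the lists above):
#   curl -fsSL https://raw.githubusercontent.com/ibrah3m/pihole-meta-quest-blocklist/main/hosts-firmware | list_blocked_domains | head
```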
Ensured ENLIL, LORIC, AUREL, and Casper Raspberry Pi nodes are using the ENLIL Pi-hole as their DNS server directly on the devices rather than relying on the router settings.
nmcli device show
sudo nmcli connection modify connection ipv4.dns "192.168.1.125" ipv4.ignore-auto-dns yes ipv6.ignore-auto-dns yes ipv6.dns "2a00:23c7:593:6501:ba27:ebff:fe0f:e3f2"
nmcli connection show target_name
Changed the DNS resolving on Raspberry Pi nodes to target the ENLIL Pi-hole hosted on 192.168.1.125.
nmcli connection show
sudo nmcli connection modify BT-RJATFG ipv4.dns "192.168.1.125" ipv4.ignore-auto-dns yes && sudo nmcli connection up BT-RJATFG && nmcli connection show BT-RJATFG
Note: the home network is on BT, which prevents overriding DNS for all devices at once, so each device currently has to be pointed at the Pi-hole manually.
docker run --rm alpine cat /etc/resolv.conf
This verifies that containers pick up the Pi-hole DNS.
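The resolv.conf check can be turned into a pass/fail test. This is a minimal sketch that reads resolv.conf content on stdin, and uses_nameserver is a hypothetical helper name. One caveat: on user-defined Docker networks the container normally shows Docker's embedded resolver (127.0.0.11) instead, so this check only fits the default bridge network.

```shell
# Succeeds (exit 0) if the resolv.conf text on stdin lists the given
# address as a nameserver.
# $1: expected nameserver address.
uses_nameserver() {
    awk -v ip="$1" '
        $1 == "nameserver" && $2 == ip { found = 1 }
        END { exit !found }
    '
}

# Usage:
#   docker run --rm alpine cat /etc/resolv.conf | uses_nameserver 192.168.1.125 \
#       || echo "container is not using the Pi-hole"
```

The same function works for the nodes themselves, for example `uses_nameserver 192.168.1.125 < /etc/resolv.conf`.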
Deployment of Pi-hole on ENLIL (Raspberry Pi 1 Model B) as a dedicated network DNS and filtering node.
Verified Pi-hole operation via web interface and query logs.
Findings:
Installed and configured Pi-hole on ENLIL.
Added blocklists sourced from Firebog.
Ran a manual Gravity update using the Pi-hole web interface.
Added manual blocking for Facebook-related domains.
Set upstream DNS provider to Cloudflare (temporary).
The /etc/hosts file was being reset after system reboots, losing custom DNS resolutions including 192.168.1.124 entries.
Investigated the cause of hosts file resetting:
cat /etc/cloud/cloud.cfg
cat /etc/cloud/templates/hosts.debian.tmpl
Findings: Cloud-init was managing /etc/hosts and overwriting custom entries on each boot.
Disabled cloud-init management of /etc/hosts and updated the template:
sudo nano /etc/cloud/cloud.cfg
Set: manage_etc_hosts: false
sudo nano /etc/cloud/templates/hosts.debian.tmpl
sudo reboot
The /etc/hosts file now retains custom entries across reboots.
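To guard against the setting being reverted (for example by a package upgrade), the cloud-init config can be checked with a small script. This is a minimal sketch that looks for an explicit `manage_etc_hosts: false` line in config text read on stdin. hosts_management_disabled is a hypothetical helper name, and the parsing is a plain text match, not a real YAML parser.

```shell
# Succeeds (exit 0) only if the config on stdin explicitly sets
# manage_etc_hosts to false. If the key appears twice, the last one wins.
hosts_management_disabled() {
    awk '
        /^[ \t]*manage_etc_hosts[ \t]*:/ {
            sub(/^[^:]*:[ \t]*/, "")   # keep only the value
            sub(/[ \t]*#.*$/, "")      # drop a trailing comment
            sub(/[ \t]+$/, "")         # drop trailing whitespace
            value = tolower($0)
        }
        END { exit !(value == "false") }
    '
}

# Usage:
#   hosts_management_disabled < /etc/cloud/cloud.cfg \
#       || echo "cloud-init may overwrite /etc/hosts"
```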
LORIC (Raspberry Pi 3 B+ orchestration node) did not respond after a reboot. The device was unreachable over the network, and the Gitea runner service on it was offline.
ip a
ifconfig wlan0
Findings:
sudo raspi-config
sudo nano /etc/hosts
Added entries for all relevant nodes.
sudo systemctl restart gitea-runner
sudo systemctl status gitea-runner
LORIC reconnected to the network and the runner service resumed normal operation.