Systems involved
| System | Role |
|---|---|
| Vendor advisory | Source announcement and firmware image. |
| TR-069 ACS (GenieACS) | Firmware distribution to CPE fleet. |
| Studio inventory | CPE records tagged by region. |
| Twilio SMS | Subscriber pre-maintenance notice. |
| Gmail | Business-tier subscriber email comms. |
| Atlassian Statuspage | Maintenance windows published. |
| Splynx | Subscriber region and contact lookup. |
| LibreNMS | Reboot and reachability verification. |
| Slack #cpe-fleet | Operational channel. |
Walkthrough
Verify and stage the image
Copilot downloads the firmware image referenced in the vendor advisory, validates its SHA checksum against the hash published in the advisory, and uploads it to the ACS repository. The image then appears in the ACS catalogue, tagged with the advisory ID.
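The checksum step can be sketched as follows. This is a minimal illustration, not the procedure's actual implementation: it assumes the advisory publishes a SHA-256 hex digest (the source only says "SHA"), and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large images are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_matches_advisory(path: Path, published_hash: str) -> bool:
    """Compare the computed digest with the hash copied from the advisory.

    Assumption: the advisory hash is a SHA-256 hex string; surrounding
    whitespace and letter case are ignored.
    """
    return sha256_of_file(path) == published_hash.strip().lower()
```

An image that fails this comparison should not be uploaded to the ACS repository.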
Plan the rollout
Split the fleet by region: 14 regions of roughly 1,000 CPEs each. Each region gets a 2-hour window, and the windows are spread over ten nights. Business subscribers are scheduled last so that any issues surface on residential CPEs first.
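One way to lay out such a plan is sketched below. The source does not say how 14 windows fit into ten nights, so this sketch assumes the later nights carry the extra regions back to back, which keeps the first night down to a single region. The 01:00 start time, the function name, and the tuple layout are also assumptions for illustration.

```python
from datetime import date, datetime, time, timedelta


def plan_windows(regions, first_night, nights=10,
                 window=timedelta(hours=2), start=time(1, 0)):
    """Return (region, window_start, window_end) tuples in rollout order.

    Assumptions: `regions` is already in rollout order, and when there are
    more regions than nights the LAST nights take the extra regions.
    """
    base, extra = divmod(len(regions), nights)
    plan = []
    remaining = iter(regions)
    for night in range(nights):
        # The last `extra` nights carry one additional region each.
        count = base + (1 if night >= nights - extra else 0)
        opens = datetime.combine(first_night + timedelta(days=night), start)
        for _ in range(count):
            region = next(remaining)
            plan.append((region, opens, opens + window))
            opens += window  # next window on the same night starts right after
    return plan
```

Calling `plan_windows([f"region-{n}" for n in range(1, 15)], date(2025, 1, 6))` (a placeholder date) yields 14 windows across ten nights, with one region per night on the first six nights.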
Pre-window subscriber comms
48 hours before each regional window, Twilio sends an SMS to residential subscribers: brief outage, window, self-service URL for status. Business subscribers get a personalised email through Gmail.
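The timing and channel rule above can be expressed as a small planning function. This is a sketch only: the subscriber fields (`id`, `tier`) and the notice dictionary are hypothetical, and no Twilio or Gmail call is made here; the output would be handed to whichever connector sends the message.

```python
from datetime import datetime, timedelta

NOTICE_LEAD = timedelta(hours=48)  # lead time stated in the walkthrough


def notice_channel(tier: str) -> str:
    """Business subscribers get email; everyone else gets SMS."""
    return "email" if tier == "business" else "sms"


def build_notices(subscribers, window_start: datetime):
    """Return one planned notice per subscriber for a regional window.

    `subscribers` is an iterable of dicts with hypothetical keys
    "id" and "tier"; the real records would come from Splynx.
    """
    send_at = window_start - NOTICE_LEAD
    return [
        {
            "subscriber_id": sub["id"],
            "channel": notice_channel(sub["tier"]),
            "send_at": send_at,
        }
        for sub in subscribers
    ]
```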
Publish maintenance
Statuspage publishes all 14 maintenance windows with the affected regions and the advisory reference. The IVR hold message picks up an automated region-aware notice 30 minutes before each window.
Execute the first window
The CPE firmware rollout procedure targets Region 1, 10 percent of the CPEs at a time. For each batch, the ACS queues the upgrade. Copilot checks LibreNMS reachability to confirm that the CPEs come back online within the expected reboot interval.
Watch the success rate
Target threshold is 99.5 percent reboot-and-reauth within the window. Region 1 hits 99.7 percent. Seven CPEs didn’t come back — Copilot flags each one with the last-known state and queues them for individual attention.
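The batching and threshold check from the last two steps might look like the sketch below. The function names are hypothetical, and the CPE identifiers and failure counts would come from the ACS and LibreNMS rather than being passed in by hand.

```python
def batches(cpe_ids, percent: int = 10):
    """Split a region's CPE list into batches of `percent` of the region.

    Integer arithmetic is used for the ceiling so that float rounding
    cannot change the batch size.
    """
    size = max(1, (len(cpe_ids) * percent + 99) // 100)
    return [cpe_ids[i:i + size] for i in range(0, len(cpe_ids), size)]


def success_rate(total: int, failed: int) -> float:
    """Percentage of CPEs that rebooted and re-authenticated in the window."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * (total - failed) / total


def meets_threshold(total: int, failed: int, threshold: float = 99.5) -> bool:
    """True when the region's success rate reaches the target threshold."""
    return success_rate(total, failed) >= threshold
```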
Handle the stragglers
For each failed CPE, Copilot pulls the RADIUS last-accounting record, the ACS session history, and the LibreNMS last-seen. Five come back on the next day’s reboot. Two are dispatched for field swap.
Subsequent regions
Each following night, the rollout procedure runs for the next region. Statuspage and #cpe-fleet maintain a running status board. Residential complaints are near-zero because the comms went out ahead.
Where Studio earns its keep
- The rollout is gated — each region only starts when the previous region hits the success threshold, so the first problem is caught on 1,000 subscribers, not 14,000.
- The SMS, the email, and the status page all point at the same regional schedule — there is no gap between “when we said” and “when it happened.”
- Failed CPEs are handled individually from the same workspace with the full history available, not marked as errors in a report for someone else to chase next Tuesday.
- The runbook is parameterized by region, so the next advisory from any CPE vendor reuses the same structure.
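The gating described in the first point above can be sketched as a loop that stops at the first region to miss the threshold. This is an illustration under assumptions, not Studio's implementation: `upgrade_region` is a hypothetical callable that runs one regional window and returns its success rate as a percentage.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def run_gated_rollout(
    regions: Sequence[str],
    upgrade_region: Callable[[str], float],
    threshold: float = 99.5,
) -> Tuple[List[str], Optional[str]]:
    """Run regions in order and halt at the first one below the threshold.

    Returns the regions that passed and the region that halted the rollout
    (None if every region passed). Regions after the halt are never started.
    """
    completed: List[str] = []
    for region in regions:
        rate = upgrade_region(region)
        if rate < threshold:
            return completed, region
        completed.append(region)
    return completed, None
```

Because the region list and the upgrade callable are both parameters, the same loop can be reused for a different advisory or vendor by passing different arguments.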
Related
Procedures
CPE firmware rollout with advisory ID and region as arguments.
Connectors and MCP
GenieACS, Twilio, and Splynx wired as connectors.