PRTG raises a red “M365 Auth Latency” sensor on three different customer probes within five minutes. The on-call MSP engineer needs to know whether the cause is the customers’ networks or Microsoft and, if it is Microsoft, to get one consistent message in front of every customer before the phones start ringing.

Systems involved

| System | Role |
| --- | --- |
| PRTG | Source alarm and sensor history. |
| Studio diagnostics | Ping, traceroute, DNS, and HTTPS path checks against outlook.office365.com. |
| Microsoft 365 Service Health | Confirm whether Microsoft has acknowledged an incident. |
| Halo PSA / ConnectWise | Bulk-update affected customer tickets. |
| Microsoft Teams | Internal #noc channel and customer-shared channels. |
| StatusPage.io | Public status page update. |
| Gmail / Outlook | Customer comms with technical contacts. |

Walkthrough

1

Acknowledge the PRTG alarm

Copilot pulls the three sensors and their history. They started failing within 90 seconds of each other across three different customer probes — not a customer-side coincidence.
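
The correlation in this step can be sketched as a small check over sensor start times. This is an illustration only, not how Copilot implements it: the function name, the `(probe, down_since)` input shape, and the thresholds are assumptions chosen to mirror the 90-second, three-probe pattern described above.

```python
from datetime import datetime, timedelta


def is_cross_customer_cluster(failures, window=timedelta(seconds=90), min_probes=3):
    """Return True if at least `min_probes` distinct probes began failing
    within `window` of each other.

    `failures` is a list of (probe_name, down_since) tuples (hypothetical shape).
    """
    # Keep only the earliest failure time seen for each probe.
    first_seen = {}
    for probe, down_since in failures:
        if probe not in first_seen or down_since < first_seen[probe]:
            first_seen[probe] = down_since

    if len(first_seen) < min_probes:
        return False

    # Slide over the sorted start times looking for a tight group.
    times = sorted(first_seen.values())
    for i in range(len(times) - min_probes + 1):
        if times[i + min_probes - 1] - times[i] <= window:
            return True
    return False
```

Counting distinct probes rather than sensors matters here: several sensors failing on one probe would still point at a single customer.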
2

Rule out customer-network paths

Copilot runs a parallel diagnostic sweep: over SSH, each customer probe pings and sends HTTPS requests to outlook.office365.com, login.microsoftonline.com, and graph.microsoft.com. All three customers have a clean Internet path, but the Microsoft endpoints respond slowly or return 5xx errors.
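
The reasoning behind this step can be sketched as a classifier over the sweep results. The `ProbeResult` type, the `internet_ok` control check, and the 2000 ms slowness threshold are all assumptions made for illustration; the actual diagnostics output may look different.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    probe: str          # customer probe name
    internet_ok: bool   # assumed control check against a non-Microsoft target
    endpoints: dict     # hostname -> (HTTP status or None on timeout, latency in ms)


def classify_sweep(results, slow_ms=2000.0):
    """Label each probe 'customer-network', 'upstream', or 'healthy'."""
    verdicts = {}
    for result in results:
        if not result.internet_ok:
            # The probe cannot reach the Internet at all: a local problem.
            verdicts[result.probe] = "customer-network"
            continue
        # A timeout, a 5xx response, or a slow response counts as unhealthy.
        unhealthy = [
            host
            for host, (status, latency_ms) in result.endpoints.items()
            if status is None or status >= 500 or latency_ms > slow_ms
        ]
        verdicts[result.probe] = "upstream" if unhealthy else "healthy"
    return verdicts
```

If every probe comes back as "upstream", the customer networks are ruled out and the next step is to check with Microsoft.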
3

Check Microsoft Service Health

Copilot calls the Microsoft 365 Service Health connector. There is an acknowledged incident EX{number} for Exchange Online authentication, scope global. That settles the diagnosis.
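
As a rough sketch of what the connector's answer boils down to, the function below picks the first unresolved incident for a service from a list of issue records. The field names (`service`, `classification`, `isResolved`) are modeled on the service health issue records exposed by Microsoft Graph, but treat the exact shape as an assumption and check it against what the connector actually returns.

```python
def find_active_incident(issues, service="Exchange Online"):
    """Return the first unresolved incident for `service`, or None.

    `issues` is a list of dicts; the keys used here are assumed, not confirmed.
    """
    for issue in issues:
        if (
            issue.get("service") == service
            and issue.get("classification") == "incident"
            and not issue.get("isResolved", False)
        ):
            return issue
    return None
```

A `None` result does not prove Microsoft is healthy; it only means no incident has been acknowledged yet.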
4

Compose the customer message once

Copilot drafts a short customer-facing message: cause (Microsoft incident), scope (Exchange auth), what’s affected (Outlook, OWA), what isn’t (Teams chat, SharePoint), workaround (existing sessions still work), the Microsoft incident ID, and the next update time.
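
The fields listed in this step map naturally onto a small template. The sketch below is hypothetical (the class name and the line labels are not taken from Studio) and shows one way to keep a single source of truth for the message so every channel receives identical wording.

```python
from dataclasses import dataclass


@dataclass
class CustomerNotice:
    cause: str
    scope: str
    affected: list
    unaffected: list
    workaround: str
    incident_id: str   # the Microsoft incident ID, passed through unchanged
    next_update: str   # human-readable time of the next update

    def render(self) -> str:
        """Render the notice as plain text, one labeled line per field."""
        lines = [
            f"Cause: {self.cause}",
            f"Scope: {self.scope}",
            f"Affected: {', '.join(self.affected)}",
            f"Not affected: {', '.join(self.unaffected)}",
            f"Workaround: {self.workaround}",
            f"Microsoft incident ID: {self.incident_id}",
            f"Next update: {self.next_update}",
        ]
        return "\n".join(lines)
```

Rendering once and reusing the string for the PSA, Teams, and the status page avoids the three versions drifting apart.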
5

Bulk-update tickets in the PSA

The PSA connector lists every ticket opened in the last 60 minutes that mentions Outlook, M365, or “email is slow.” Copilot stages a bulk update with the message, links the Microsoft incident, and pauses for approval. You scan the list, untick two unrelated tickets, and approve.
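
The selection and approval logic in this step can be sketched as two small functions. The ticket dictionary keys (`id`, `status`, `opened_at`, `summary`, `details`) are assumptions for illustration and do not reflect the Halo PSA or ConnectWise schemas.

```python
from datetime import datetime, timedelta

KEYWORDS = ("outlook", "m365", "email is slow")


def candidate_tickets(tickets, now, window=timedelta(minutes=60), keywords=KEYWORDS):
    """Select open tickets created within `window` that mention any keyword."""
    cutoff = now - window
    selected = []
    for ticket in tickets:
        text = f"{ticket['summary']} {ticket.get('details', '')}".lower()
        is_recent_and_open = ticket["status"] == "open" and ticket["opened_at"] >= cutoff
        if is_recent_and_open and any(keyword in text for keyword in keywords):
            selected.append(ticket)
    return selected


def apply_review(staged, excluded_ids):
    """Drop the tickets the reviewer unticked before the bulk update runs."""
    return [ticket for ticket in staged if ticket["id"] not in excluded_ids]
```

Keyword matching over-selects by design, which is why the human review in `apply_review` sits between staging and sending.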
6

Post in customer-shared Teams channels

For customers with a shared Teams channel, Copilot posts the same message and tags the right contacts. The message stays pinned at the top of each channel for visibility.
7

Update the public status page

The StatusPage.io connector publishes a Monitoring incident pointing at the Microsoft outage and links the upstream Microsoft advisory.
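
For orientation, the request behind this step might carry a payload like the one built below. The nested `incident` object with `name`, `status`, and `body` follows the general shape of a Statuspage incident creation request, but this is a sketch under that assumption: the function name and the `advisory_url` parameter are hypothetical, and the connector may build its request differently.

```python
def build_status_incident(incident_id, advisory_url, summary):
    """Build a status page incident payload in the 'monitoring' state.

    `advisory_url` is a hypothetical parameter for the upstream advisory link.
    """
    body = (
        f"{summary}\n\n"
        f"Upstream Microsoft incident: {incident_id}\n"
        f"Microsoft advisory: {advisory_url}"
    )
    return {
        "incident": {
            "name": f"Microsoft 365 upstream incident ({incident_id})",
            "status": "monitoring",
            "body": body,
        }
    }
```

Publishing as monitoring rather than investigating reflects that the cause is already known and sits with Microsoft.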
8

Set a follow-up timer

Copilot adds a 30-minute follow-up reminder. When the timer fires, it re-checks Service Health and the PRTG sensors, then updates the same channels with progress or an all-clear.
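
The decision made each time the timer fires can be sketched as follows. The function name, the `sensor_states` mapping, and the `"up"` state label are assumptions for illustration; the 30-minute interval comes from the step above.

```python
from datetime import datetime, timedelta


def follow_up(now, incident_resolved, sensor_states, interval=timedelta(minutes=30)):
    """Decide what to post when the follow-up timer fires.

    Returns ("all-clear", None) once Microsoft reports the incident resolved
    and every sensor is back up; otherwise ("progress", next_check_time).
    """
    sensors_recovered = all(state == "up" for state in sensor_states.values())
    if incident_resolved and sensors_recovered:
        return ("all-clear", None)
    # Still degraded on at least one side: post progress and re-arm the timer.
    return ("progress", now + interval)
```

Requiring both signals guards against announcing an all-clear while customer sensors are still red, or while Microsoft has not yet confirmed resolution.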

Where Studio earns its keep

  • One diagnostic run touches every customer probe at once — no SSH-jumping between consoles to confirm a global pattern.
  • The same message reaches the PSA, Teams, and the status page with one approval, instead of forty manual posts.
  • The follow-up loop is automatic: the 30-minute check happens whether you remember it or not.
  • The all-clear closes every ticket and posts a final status without you composing it three times.

AI Copilot

Use Planning when the bulk update needs a careful review before it goes out.

Connectors and MCP

How Studio reaches PRTG, the PSA, Microsoft Service Health, Teams, and StatusPage.io.