October and November 2025 were not great months for DNS.
October 20th: AWS had a DNS meltdown that took out chunks of the internet for 15 hours.
October 29th: Microsoft Azure followed with their own DNS incident impacting Azure and Microsoft 365 services for about two hours.
November 18th: Cloudflare experienced what they called “an internal service degradation” that took down X (Twitter), ChatGPT, Spotify, and hundreds of other services (including DownDetector, ironically) for roughly 3 hours. Their CTO publicly apologized, saying “we failed our customers and the broader Internet.”
Three major infrastructure providers. Four weeks. DNS problems every time.
IT teams across the government kept watching the same frustrating pattern unfold: their applications were perfectly healthy. Servers humming. Databases responding. Security controls green across the board.
But nobody could reach them.
The infrastructure was fine. The DNS (that invisible phone book that tells the internet where everything lives) had forgotten how to answer the phone.
After spending two decades in government and now working with agencies from the outside, I’m writing this article because I think we need to talk about DNS. Not in a “buy our solution” way, but in a “hey, there’s a free government service that could save you from this exact problem and not everyone knows about it” way.
The Pattern Nobody Talks About
Let me tell you what prompted this whole thing. I recently looked at the DNS setup for a couple of agencies, GSA and Commerce, since their records are public. What I found was…educational.
GSA’s DNS showed a pattern I suspect is common: multiple nameservers that all trace back to the same cloud provider infrastructure. It looks redundant on paper. “See? Five nameservers!” But when that provider has a control plane issue (not even a full regional outage, just a configuration hiccup), all five nameservers become equally useless. The domain just vanishes from the internet until someone at the cloud provider fixes their automation.
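If you want to run this kind of check yourself, one rough approach is to group a domain's nameserver hostnames by the provider that operates them. The sketch below is illustrative only: the fragments in `PROVIDER_PATTERNS` are a small hand-picked sample, the nameserver names in `example` are made up, and none of it describes what any particular agency actually runs.

```python
from collections import Counter

# Illustrative hostname fragments only; a real inventory would need a fuller list.
PROVIDER_PATTERNS = {
    "awsdns-": "AWS Route 53",
    ".ns.cloudflare.com": "Cloudflare",
    "azure-dns.": "Azure DNS",
}


def provider_for(nameserver: str) -> str:
    """Map a nameserver hostname to a provider label, or 'unknown'."""
    host = nameserver.lower().rstrip(".")
    for fragment, provider in PROVIDER_PATTERNS.items():
        if fragment in host:
            return provider
    return "unknown"


def provider_counts(nameservers: list[str]) -> Counter:
    """Count how many nameservers each provider operates."""
    return Counter(provider_for(ns) for ns in nameservers)


def is_single_provider(nameservers: list[str]) -> bool:
    """True when every nameserver maps to the same known provider."""
    counts = provider_counts(nameservers)
    return len(counts) == 1 and "unknown" not in counts


# Made-up nameserver names: four different TLDs, but one operator behind them.
example = [
    "ns-1.awsdns-01.com.",
    "ns-2.awsdns-02.net.",
    "ns-3.awsdns-03.org.",
    "ns-4.awsdns-04.co.uk.",
]
print(provider_counts(example))
print(is_single_provider(example))
```

Grouping by provider rather than by domain suffix matters here, because a set of nameservers can sit under several different top-level domains and still depend on a single control plane.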
Commerce.gov told a different story. They’re entirely on Cloudflare’s Anycast network, which is still technically single-provider from a contract perspective but is built on infrastructure that spans 300+ global locations. To take that down, you’d need a company-wide catastrophe, not just a regional glitch. It’s a fundamentally different risk profile.
Right on cue, Cloudflare had exactly that kind of company-wide incident. Three hours of disruption. “Internal service degradation” that hit their entire platform.
Here’s the thing: Commerce.gov probably came through fine because they’re using Cloudflare for authoritative DNS (the “where is commerce.gov” part), not for application hosting. But hundreds of companies that were fully dependent on Cloudflare – for hosting, Content Delivery Network (CDN), security, everything – went dark.
The lesson isn’t “Cloudflare is unreliable.” It’s that no single provider, no matter how well-architected, is immune to failure. AWS had 15 hours down. Azure had 2 hours. Cloudflare had 3 hours. All within four weeks.
While I only took the time to look at these two, I’d bet good money it’s a mixed bag across the federal enterprise. Some agencies are in great shape. Others are one bad automation script away from disappearing from the internet.
The problem? You can’t tell from the outside. They all look fine until something breaks.
“But Doesn’t DNS Cache?”
Every time I bring this up, someone inevitably says, “Wait, doesn’t DNS cache? Won’t that keep us online?”
Sort of. DNS caching is like having battery backup on your smoke detectors. It’s great for short interruptions, but it’s not a solution to the underlying problem.
Here’s why caching won’t save you:
- New connections fail immediately. That user trying to access your portal for the first time? They don’t have anything cached, so they will only get an error.
- Backend systems don’t cache like browsers. Your identity provider checking certificate validity? Your email server verifying DMARC records? Your API validating tokens? They all need fresh DNS lookups. When authoritative DNS is down, all of that breaks.
- The clock always runs out. Whether your time-to-live (TTL) is 5 minutes or an hour, eventually those cached records expire. And when they do, if your authoritative DNS can’t answer, everyone’s in the same boat as the new users.
Caching buys you time, but it’s not a fix.
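To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes a resolver stops serving a record once its TTL expires, and the function name `cache_coverage` is just something I made up for illustration. In the best case a record was cached the instant before the outage began, so it lasts one full TTL. In the worst case it was about to expire and covers nothing.

```python
def cache_coverage(ttl_seconds: float, outage_seconds: float) -> dict[str, float]:
    """Best-case share of an outage that one cached record can cover.

    Best case means the record was cached just before the outage started,
    so it survives for a full TTL (or the whole outage, if that is shorter).
    """
    if ttl_seconds < 0 or outage_seconds <= 0:
        raise ValueError("ttl must be >= 0 and outage must be > 0")
    best = min(ttl_seconds, outage_seconds)
    return {
        "best_case_seconds": best,
        "best_case_fraction": best / outage_seconds,
    }


HOUR = 3600
# Outage lengths taken from the incidents described in this article.
for label, outage_hours in [("AWS", 15), ("Azure", 2), ("Cloudflare", 3)]:
    for ttl in (300, HOUR):
        result = cache_coverage(ttl, outage_hours * HOUR)
        print(f"{label}: TTL {ttl}s covers at most "
              f"{result['best_case_fraction']:.1%} of a {outage_hours}h outage")
```

Even with a one-hour TTL, the best case covers about 7% of a 15-hour outage, and a five-minute TTL covers well under 1%.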
Here’s The Good News: The Government Already Built the Solution
This is the part where I tell you something that might surprise you. The federal government already solved the biggest piece of this problem, and they’re giving it away for free.
CISA’s Protective DNS service has been available to federal civilian agencies since September 2022. As of now, over 110 agencies have been onboarded. It processes more than 3 billion DNS queries per day with a 99.999% uptime rate. Let me say that again: 99.999% uptime.
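For a sense of what five nines means in practice, here is a quick back-of-the-envelope calculation. It is a sketch under two assumptions of mine: that the uptime figure is measured over a 365-day year (the measurement window isn't stated here), and that the comparison case is a year with a single 15-hour outage and nothing else.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; ignores leap years


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Minutes of downtime per year that a given uptime percentage permits."""
    return (100.0 - uptime_percent) / 100.0 * MINUTES_PER_YEAR


def uptime_percent_after_outage(outage_hours: float) -> float:
    """Yearly uptime percentage if the only downtime is one outage."""
    return 100.0 * (1 - outage_hours * 60 / MINUTES_PER_YEAR)


print(f"99.999% uptime allows {allowed_downtime_minutes(99.999):.2f} minutes per year")
print(f"One 15-hour outage leaves {uptime_percent_after_outage(15):.2f}% uptime")
```

That works out to a little over five minutes of downtime per year at 99.999%, while a single 15-hour outage pulls an otherwise perfect year down to roughly 99.83%.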
Protective DNS Resolver supports:
- Traditional on-premises networks
- Cloud infrastructure
- Mobile, roaming, and nomadic devices
- Encrypted DNS (DoH and DoT)
- IPv4 and IPv6
It blocks queries to known malicious destinations using threat intel from commercial feeds, government sources, and industry partners. Agencies receive real-time alerts via API whenever something suspicious occurs. It also provides complete DNS traffic logs and dashboards, giving you visibility into what’s hitting your network.
And again, because this is important: it’s free to federal civilian agencies.
So Why Isn’t Everyone Using It Yet?
That’s the question, isn’t it?
OMB M-22-09 (the Zero Trust memo) and CISA’s Encrypted DNS Implementation Guidance aren’t suggestions. They’re mandates. Agencies are required to:
- Route DNS queries through CISA Protective DNS
- Encrypt DNS traffic where technically feasible
- Block unauthorized DNS egress at the firewall
- Implement DNSSEC validation
- Enforce DMARC policies on email domains
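The DMARC item is one of the easier ones to spot-check, because the policy lives in a public TXT record. Here is a minimal sketch that parses a record string and checks whether it enforces rejection. The record shown uses a placeholder address at example.com, the function names are my own, and fetching the record from DNS is left out on purpose.

```python
def parse_dmarc(record: str) -> dict[str, str]:
    """Split a DMARC TXT record into lowercase tag names and their values."""
    tags: dict[str, str] = {}
    for part in record.split(";"):
        if "=" not in part:
            continue  # skip empty segments such as a trailing semicolon
        name, value = part.split("=", 1)
        tags[name.strip().lower()] = value.strip()
    return tags


def enforces_reject(record: str) -> bool:
    """True if the record is DMARC1, has p=reject, and applies to 100% of mail."""
    tags = parse_dmarc(record)
    return (
        tags.get("v") == "DMARC1"
        and tags.get("p", "").lower() == "reject"
        and tags.get("pct", "100") == "100"  # pct defaults to 100 when absent
    )


# Placeholder record for illustration only.
sample = "v=DMARC1; p=reject; rua=mailto:reports@example.com"
print(parse_dmarc(sample))
print(enforces_reject(sample))
```

A record with `p=none` or `p=quarantine`, or one with a `pct` value below 100, would come back as not fully enforcing.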
The September 2024 deadline came and went. Agencies were “nearing implementation” heading into that deadline, with the understanding that Zero Trust is a journey, not a destination.
But 110 agencies out of…how many federal civilian agencies are there? Depending on how you count, somewhere between 100 and 430. The point is, there are still agencies that haven’t onboarded to CISA’s Protective DNS service.
Maybe they’re buried in procurement paperwork (even though it’s free, government’s gonna government). Maybe they’re stuck with legacy systems that need modernization first. Or maybe they just haven’t prioritized it, because after all, DNS has always “just worked”…until October 2025, when it very publicly didn’t.
What Should Agencies Actually Be Doing?
If you’re reading this and realizing your agency isn’t using CISA Protective DNS yet, or has only partially implemented it, here’s some unsolicited advice from someone who just spent 21 years navigating government IT:
- Start with the CISA onboarding. Contact CISA’s Cybersecurity Shared Services Office at CyberSharedServ@cisa.dhs.gov. They have a whole team dedicated to helping agencies through this. It’s literally their job, and the service is free.
- Do an honest assessment of your authoritative DNS. Pull up your domain records and check where your nameservers are actually hosted. If they’re all with one provider, understand the risk that you are accepting. Not saying you need to fix it tomorrow, but you should at least know what breaks if that provider has a bad day.
- Map your encrypted DNS capabilities. Can your endpoints handle DoH/DoT (the two main encryption protocols for sending DNS requests)? What about your cloud workloads? Your IoT devices? This gets complicated quickly, which is why CISA published implementation guidance to walk through different scenarios.
- Get your firewall rules in order. If devices can still bypass your DNS controls and query 8.8.8.8 or 1.1.1.1 directly, you’re not getting the protection Protective DNS offers. Block unauthorized DNS at the perimeter.
- Don’t sleep on DNSSEC, SPF, DKIM, and DMARC. These aren’t separate security and authentication projects; they’re part of the same DNS security posture you’re building. DNSSEC validation at the resolver, DMARC at p=reject for your email domains.
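For the firewall item, one way to find out whether devices are bypassing your resolvers is to scan flow logs for DNS traffic headed anywhere other than an approved resolver. The sketch below assumes flow records are already available as simple tuples. The allowlisted resolver address is a documentation placeholder (192.0.2.53), not a real Protective DNS address, and the function name is hypothetical.

```python
from ipaddress import ip_address

# Plain DNS and DNS over TLS. DoH rides on port 443 and is not caught by a port check.
DNS_PORTS = {53, 853}


def find_dns_bypass(flows, allowed_resolvers):
    """Return flows that send DNS traffic to a resolver outside the allowlist.

    flows: iterable of (source_ip, destination_ip, destination_port) tuples.
    allowed_resolvers: iterable of approved resolver IP address strings.
    """
    allowed = {ip_address(r) for r in allowed_resolvers}
    return [
        (src, dst, port)
        for src, dst, port in flows
        if port in DNS_PORTS and ip_address(dst) not in allowed
    ]


# Made-up flow records for illustration.
sample_flows = [
    ("10.0.0.5", "8.8.8.8", 53),        # direct query to a public resolver
    ("10.0.0.6", "192.0.2.53", 53),     # approved resolver (placeholder address)
    ("10.0.0.7", "1.1.1.1", 853),       # DNS over TLS to a public resolver
    ("10.0.0.8", "203.0.113.10", 443),  # ordinary HTTPS, ignored by this check
]
for flow in find_dns_bypass(sample_flows, ["192.0.2.53"]):
    print("Unauthorized DNS egress:", flow)
```

With the sample data, the first and third flows are flagged. Catching DoH to unapproved resolvers takes more than a port filter, which is part of why mapping your encrypted DNS capabilities comes before tightening the rules.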
The October outages should be a wake-up call, not because I’m trying to sell you something, but because DNS fragility is real and the next incident could be the one that hits your agency.
–––
I work with Arctic IT Government Solutions (part of Doyon Technology Group), helping federal agencies navigate the technical and compliance challenges of modernizing critical infrastructure. But more importantly, I’m a former government IT person who wants to see agencies succeed at this stuff.
Questions? Want to talk through your specific situation? Reach out and let’s have a conversation.
P.S. If you found this helpful, please share it with your colleagues. The more agencies that know about CISA Protective DNS, the fewer agencies get caught in the next outage.

By Robin Zickgraf, Account Executive at Arctic IT Government Solutions

