Those who spend a lot of time online are asking plenty of questions after a massive Internet outage caused disruptions around the world Friday.
Ontario-based technology analyst Carmi Levy says a faulty software update from a cybersecurity company called CrowdStrike triggered the outage. The company provides a product called Falcon Sensor and rolled out an update for it on Thursday night.
“Unfortunately they didn’t test it as well as they probably could have and they ended up with a critical error in it and it caused the Windows computers it was installed on to crash.”
The affected computers tried to reboot themselves but couldn’t, he said, leading to the dreaded “blue screen of death.”
The problem affected systems that rely on Microsoft Windows computers, including those used by banks, airlines, border crossing services and health care providers, among many others. Levy said the outage likely affected more than a billion people.
He says the “good news here — if there is any good news in all of this — is that this is not a cyberattack. This was not a criminal act in any way, shape or form. This was simply an accident where software developers worked to release an update on their software… unfortunately there was an error in the code that did not get caught.”
It’s not the first time a massive Internet outage has caused problems, with Levy pointing to the Rogers outage that Canadians experienced in 2022.
“It really does illustrate the risks that we run in today’s cloud-driven, digital economy,” said Levy. “Most of these services, they’re incredibly reliable for the most part. But when they are not available, when there’s a problem with them, the impact is incredibly widespread.”
Levy says plenty is still unknown about how Friday’s outage happened.
“Certainly there are more questions now than there are answers,” said Levy.
“But there will be an investigation, if not many investigations, and there will be a report that will help us better understand what happened. Clearly, CrowdStrike really needs to explain how it allowed a faulty update to get out of testing and into production.”
Levy said questions also need to be asked about what happened to “all the fail-safes, the backups, the additional protections that are designed to keep systems from failing when errors like this get introduced into the environment. There are a lot of things that contributed to this particular event, a cascading series of problems.”
He expects the investigative process to be similar to what happens after a plane crash.
So how can people be prepared for the next time an outage like this happens?
Levy recommends having “a workaround, a plan B in case these services are unavailable,” including low-tech solutions to fall back on if an Internet outage knocks you offline.