• About
  • Subscribe
  • Contact
Thursday, May 8, 2025
    Login
FutureCISO
  • People
  • Process
  • Technology
  • Resources
    • White Papers
    • PodChats
No Result
View All Result
  • People
  • Process
  • Technology
  • Resources
    • White Papers
    • PodChats
No Result
View All Result
FutureCISO
No Result
View All Result
Home Process

Lessons from a global IT outage

Melinda Baylon by Melinda Baylon
August 12, 2024
Lessons from a global IT outage
Share on FacebookShare on Twitter

A defective software update from cybersecurity firm CrowdStrike for its Microsoft Windows hosts left hospitals, banks, airlines, and other businesses scrambling during a global mass IT outage weeks ago.

George Kurtz

CrowdStrike founder and CEO George Kurtz has since apologised to their customers and partners in a letter he penned in the aftermath of the outage, saying that nothing is more important to him than their “trust and confidence.”

The aftermath

In the aftermath of the outage, verified business and technology professionals participating in Gartner Peer Community polls and discussions revealed that the majority said the outage had some impact on their organisations, with 47% saying it had either a severe impact (14%) or a significant impact (33%) on their operations. In comparison, 20% said there was no impact.

To mitigate risks and manage the impact of the outage, organisations focused on employee support by assisting affected employees with alternative work arrangements and internal updates (59%). Almost half (48%) said their organisations focused on activating recovery procedures and guiding the help desk on increased calls, while 44% of respondents said their organisations focused on incident response by activating in-depth security investigations and response procedures.

Other actions organisations took are implementing additional security controls (30%), providing updates to customers and external stakeholders (28%), communicating directly with CrowdStrike (25%), seeking support from external consultants, vendors (23%), and service providers, and talking directly with Microsoft (16%).

Lessons from the global IT outage

While unable to confirm if this was the largest IT outage in history Nathan Wenzler, chief security strategist at Tenable, says the incident has left its mark on the global tech and business landscape, with an undeniable impact on an estimated 8.5 million Windows devices.

"An outage of this scale brings several lessons to the fore, and as we reflect a week later, we can distil some critical learnings from this event," declared Wenzler

Commenting on the news, Vishal Ghariwala, chief technology officer (CTO) for SUSE Asia Pacific, says other companies should also prepare for such incidents.

“What happened with Microsoft and CrowdStrike was a situation nobody wants to find themselves in. The incident showed how a seemingly simple IT update can paralyse global IT systems and our day-to-day lives. I don’t think this is about CrowdStrike or Windows. This can happen with any software. More importantly, it stresses the importance of incorporating digital resilience so that mission-critical IT services can withstand serious disruptions,” Ghariwala said.

Related:  CISO influence among SG enterprises rising

Reflecting on the lessons learned from the incident, the SUSE executive reminds organisations of some measures they could take to prevent or minimise disruptions caused by IT outages.

Vishal Ghariwala

“One approach is to reduce software concentration risks by adopting diverse IT stacks, especially for critical services. It may seem counterintuitive, but moving away from monolithic IT environments allows organisations to quickly switch mission-critical operations while they fix the issue at hand."Vishal Ghariwala

He also reminds organisations to consider adopting a dual operating system (OS) strategy so they can easily switch to the other when one is affected.

“This may not always be a cheap and easy architectural, technological, or business decision - especially if companies are trapped in a rigid IT ecosystem that inhibits agility and interoperability. However, in my honest opinion, this is something that every organisation should proactively consider so that their essential IT services can be preserved in the face of major disruptions.” Ghariwala said.

For Forrester VP principal analyst Andras Cser, the solution is attaching a physical keyboard to each affected system, booting into Safe Mode, removing the compromised CrowdStrike update, and rebooting.

Andras Cser

"Remediating this issue requires significant effort. Prior track records of similar incidents have shown that vendors’ operations, product testing, and communications strategies only get better after such incidents occur.” Andras Cser

The way forward 

Forrester recommends that tech and security leaders empower authorised system administrators to fix problems quickly, communicate internally and externally the impacts, status, and progress of remediation efforts, pay attention to the vendor’s communication strategies, follow official advice, and look after their people.  

Once the immediate issue is fixed, Forrester recommends implementing infrastructure automation, refreshing and rehearsing their IT outage response plan, and getting unified, written warranties from security vendors on their quality assurance processes and threat detection effectiveness. 

In the longer term, Forrester reminds organisations to reevaluate third-party risk strategy and approach and use the contract as a risk mitigation tool. 

Immediate, midterm, and long term

Based on Gartner’s analysis, organisations should organise action plans based on immediate, midterm, and long-term programs.

Related:  Cisco AI Assistant for Security unveiled

In the short term, Gartner recommends organisations avoid overreacting by immediately mandating the decommission, disablement, or replacement of affected software.

In terms of midterm actions, organisations could assess the impact of an IT outage on secondary systems, seek exposed vulnerabilities, and ensure visibility into systemwide updates and releases.

It is also wise to review anomalies or unusual trends to minimise the risks of undetected opportunistic attacks, participate in the business impact analysis, communicate with senior leadership on the current status of PCs, check agent automatic update settings, and ensure they are consistent with existing organisational change control policy, and manage burnout/fatigue in your team to minimise the risk of error.

In the long term, it is crucial to mitigate or reduce the risk of the same level of business impact or exposure by reviewing the prevention, response, and support procedures for large-scale outages, checking and updating downtime procedures for critical operations, and revising crisis communication plans, incident response processes and business continuity management/IT disaster recovery plans, and ensuring the competencies of critical employees with response and recovery responsibilities. It is also essential to assess the operational impacts before deploying a security agent.

Significant discussions

The CrowdStrike-Microsoft outage will be a springboard for further significant discussions and questions on risk mitigation, vigilance, testing, and cybersecurity.

Nathan Wenzler

“Many organisations will be asking the question about whether or not it will be worth allocating dedicated resources to ensure these sorts of matters are caught in the future by more stringent testing and QA processes, or if they take the risk that another outage like this won’t happen for several more years and the cost of the impact at that time will be less than the cost of the testing process.  Each organisation will need to determine the best course of action based on their respective needs and risk tolerance,” Tenable’s Wenzler said.

As organisations globally remain at risk, the lessons from the global IT outage can serve as a wake-up call for them to proactively implement short-, midterm, and long-term measures to strengthen their cybersecurity posture and minimize disruptions caused by large-scale IT outages.

Tags: CrowdStrikeGartnerIT outageSUSETenable
Melinda Baylon

Melinda Baylon

Melinda Baylon joins Cxociety as editor for FutureCIO and FutureIoT. As editor, she will be the main editorial contact for communications professionals looking to engage with aforementioned media titles. 

Melinda has adecade-long career in the media industry and served as TV reporter for ABS-CBN and IBC 13. She also worked as a researcher for GMA-7 and a news reader for Far East Broadcasting Company Philippines. 

Prior to working for Cxociety, she worked for a local government unit as a public information officer. She now ventures into the world of finance and technology writing while pursuing her passions in poetry, public speaking and content creation. 

Based in the Philippines, she can be reached at [email protected]

No Result
View All Result

Recent Posts

  • DDoS attacks surge in Asia Pacific, claims Cloudflare
  • Reimagining security for the AI Era
  • PodChats for FutureCISO: Articulating the business value of security in 2025
  • New standard for cybersecurity at the storage layer
  • Cybersecurity challenges persist despite improved defenses

Categories

  • Blogs
  • Compliance and Governance
  • Culture and Behaviour
  • Cybersecurity careers
  • Data Protection
  • Endpoint Security
  • Incident Response
  • Network Security
  • People
  • Process
  • Resources
  • Risk Management
  • Technology
  • Training and awarenes
  • Videos
  • Webinars and PodChats
  • White Papers

Strategic Insights for Chief Information Officers

FutureCISO serves the interests of the Chief Information Security Officer (CISO) and the information security profession. Its purpose is to provide relevant and timely industry insights around all things important to security professionals and organisations that recognize and value the importance of protecting the organisation’s data and its customers’ privacy.

Cxociety Media Brands

  • FutureIoT
  • FutureCFO
  • FutureCIO

Categories

  • Privacy Policy
  • Terms of Use
  • Cookie Policy

Copyright © 2024 Cxociety Pte Ltd | Designed by Pixl

Login to your account below

or

Not a member yet? Register here

Forgotten Password?

Fill the forms bellow to register

All fields are required. Log In

Retrieve your password

Please enter your username or email address to reset your password.

Log In
No Result
View All Result
  • People
  • Process
  • Technology
  • Resources
    • White Papers
    • PodChats
Login

Copyright © 2024 Cxociety Pte Ltd | Designed by Pixl