• About
  • Subscribe
  • Contact
Thursday, May 8, 2025
    Login
FutureCISO
  • People
  • Process
  • Technology
  • Resources
    • White Papers
    • PodChats
No Result
View All Result
  • People
  • Process
  • Technology
  • Resources
    • White Papers
    • PodChats
No Result
View All Result
FutureCISO
No Result
View All Result
Home Process Risk Management

Open source tool to safeguard against AI model jailbreaks

FutureCISO Editors by FutureCISO Editors
December 20, 2024
Cisco APAC partners diverting focus to security

Photo by Christina Morillo: https://www.pexels.com/photo/two-women-looking-at-the-code-at-laptop-1181263/

Share on FacebookShare on Twitter

As artificial intelligence (AI) models continue to revolutionise various industries through enhanced customer interactions and automation, they simultaneously introduce new security challenges that many organisations are ill-equipped to handle. According to IBM, AI jailbreaks occur when hackers exploit vulnerabilities in AI systems to bypass their ethical guidelines and perform restricted actions. They use common AI jailbreak techniques, such as prompt injections and roleplay scenarios.

Put another way, jailbreaking attacks can be used to alter model behaviour and benefit the attacker. If not properly controlled, business entities can face fines, reputational harm, and other legal consequences.

In 2023, researchers at Carnagie Mellon University, The Center for AI Safety, and the Bosch Center for AI, claim to have discovered a simple prompt addendum that allowed the researchers to trick models into generating biased, false, and otherwise information. The attacks were shown to be effective in ChatGPT, Google Bard, Meta's LLaMA, Anthropic's Claude, and other open source products.

CyberArk claims that its FuzzyAI has successfully jailbroken every major AI model tested, providing a vital tool for identifying and mitigating risks associated with guardrail bypassing and harmful output generation in both cloud-hosted and in-house AI systems.

At the core of FuzzyAI is a robust fuzzer—a tool designed to reveal software defects and vulnerabilities—capable of employing over ten distinct attack techniques. These techniques include methods for bypassing ethical filters and exposing hidden prompts within the systems. Key features of FuzzyAI include comprehensive fuzzing capabilities that probe AI models to identify vulnerabilities such as guardrail bypassing, information leakage, and harmful output generation.

Related:  New identity security platform boosts cybersecurity

Peretz Regev, chief product officer at CyberArk, stated, “The launch of FuzzyAI underlines CyberArk’s commitment to AI security and helps organisations take a significant step forward in addressing the security issues inherent in the evolving landscape of AI model usage. FuzzyAI has demonstrated the ability to jailbreak every major tested AI model. This empowers organisations and researchers to identify weaknesses and actively fortify their AI systems against emerging threats.”

FuzzyAI also features an extensible framework, allowing organisations and researchers to add their own attack methods tailored to specific domain vulnerabilities. This flexibility, combined with a community-driven ecosystem, ensures that FuzzyAI evolves alongside emerging adversarial techniques and defence mechanisms.

As the significance of AI security continues to grow, CyberArk's FuzzyAI represents an advancement for organisations seeking to enhance their resilience against sophisticated AI threats, ultimately contributing to safer AI development and deployment.

Tags: AI jailbreakCyberArk
FutureCISO Editors

FutureCISO Editors

No Result
View All Result

Recent Posts

  • DDoS attacks surge in Asia Pacific, claims Cloudflare
  • Reimagining security for the AI Era
  • PodChats for FutureCISO: Articulating the business value of security in 2025
  • New standard for cybersecurity at the storage layer
  • Cybersecurity challenges persist despite improved defenses

Categories

  • Blogs
  • Compliance and Governance
  • Culture and Behaviour
  • Cybersecurity careers
  • Data Protection
  • Endpoint Security
  • Incident Response
  • Network Security
  • People
  • Process
  • Resources
  • Risk Management
  • Technology
  • Training and awarenes
  • Videos
  • Webinars and PodChats
  • White Papers

Strategic Insights for Chief Information Officers

FutureCISO serves the interests of the Chief Information Security Officer (CISO) and the information security profession. Its purpose is to provide relevant and timely industry insights around all things important to security professionals and organisations that recognize and value the importance of protecting the organisation’s data and its customers’ privacy.

Cxociety Media Brands

  • FutureIoT
  • FutureCFO
  • FutureCIO

Categories

  • Privacy Policy
  • Terms of Use
  • Cookie Policy

Copyright © 2024 Cxociety Pte Ltd | Designed by Pixl

Login to your account below

or

Not a member yet? Register here

Forgotten Password?

Fill the forms bellow to register

All fields are required. Log In

Retrieve your password

Please enter your username or email address to reset your password.

Log In
No Result
View All Result
  • People
  • Process
  • Technology
  • Resources
    • White Papers
    • PodChats
Login

Copyright © 2024 Cxociety Pte Ltd | Designed by Pixl