As many of us grow accustomed to using artificial intelligence tools daily, it's worth remembering to keep our questioning hats on. Nothing is completely safe and free from security vulnerabilities. Still, companies behind many of the most popular generative AI tools are constantly updating their safety measures to prevent the generation and proliferation of inaccurate and harmful content.
In a research paper examining the vulnerability of large language models (LLMs) to automated adversarial attacks, the authors demonstrated that even a model said to be resistant to attacks can still be tricked into bypassing its content filters and producing harmful information, misinformation, and hate speech. That weakness leaves these models open to misuse.
"This shows -- very clearly -- the brittleness of the defenses we are building into these systems," Aviv Ovadya, a researcher at the Berkman Klein Center for Internet & Society at Harvard, told The New York Times.