Source: Vitalii Vodolazskyi via Shutterstock
In a net positive for researchers testing the security and safety of AI systems and models, the US Library of Congress ruled that certain types of offensive activities — such as prompt injection and bypassing rate limits — do not violate the Digital Millennium Copyright Act (DMCA), a law used in the past by software companies to push back against unwanted security research.
The Library of Congress, however, declined to create an exemption for security researchers under the fair use provisions of the law, arguing that an exemption would not be enough to provide security researchers safe haven.
Overall, the triennial update to the legal framework around digital copyright works in the security researchers' favor, as does having clearer guidelines on what is permitted, says Casey Ellis, founder and adviser to crowdsourced penetration testing service BugCrowd.
"Clarification around this type of thing — and just making sure that security researchers are operating in as favorable and as clear an environment as possible — that's an important thing to maintain, regardless of the technology," he says. "Otherwise, you end up in a position where the folks who own the [large language models], or the folks that deploy them, they're the ones that end up with all the power to basically control whether or not security research is happening in the first place, and that nets out to a bad security outcome for the user."
Security researchers have increasingly gained hard-won protections against prosecution and lawsuits for conducting legitimate research. In 2022, for example, the US Department of Justice stated that its prosecutors would not charge security researchers with violating the Computer Fraud and Abuse Act (CFAA) if they did not cause harm and pursued the research in good faith. Companies that sue researchers are regularly shamed, and groups such as the Security Legal Research Fund and the Hacking Policy Council provide additional resources and defenses to security researchers pressured by large companies.
In a post to its site, the Center for Cybersecurity Policy and Law called the clarifications by the US Copyright Office "a partial win" for security researchers — providing more clarity but not safe harbor. The Copyright Office is organized under the Library of Congress's purview.
"The gap in legal protection for AI research was confirmed by law enforcement and regulatory agencies such as the Copyright Office and the Department of Justice, yet good faith AI research continues to lack a clear legal safe harbor," the group stated. "Other AI trustworthiness research techniques may still risk liability under DMCA Section 1201, as well as other anti-hacking laws such as the Computer Fraud and Abuse Act."
Brave New Legal World
The fast adoption of generative AI systems and algorithms based on big data have become a major disruptor in the information-technology sector. Given that many large language models (LLMs) are based on mass ingestion of copyrighted information, the legal framework for AI systems started off on a weak footing.
For researchers, past experience provides chilling examples of what could go wrong, says BugCrowd's Ellis.
"Given the fact that it's such a new space — and some of the boundaries are a lot fuzzier than they are in traditional IT — a lack of clarity basically always converts to a chilling effect," he says. "For folks that are mindful of this, and a lot of security researchers are pretty mindful of making sure they don't break the law as they do their work, it has resulted in a bunch of questions coming out of the community."
The Center for Cybersecurity Policy and Law and the Hacking Policy Council proposed that red teaming and penetration testing for the purpose of testing AI security and safety be exempted from the DMCA, but the Librarian of Congress recommended denying the proposed exemption.
The Copyright Office "acknowledges the importance of AI trustworthiness research as a policy matter and notes that Congress and other agencies may be best positioned to act on this emerging issue," the Register entry stated, adding that "the adverse effects identified by proponents arise from third-party control of online platforms rather than the operation of section 1201, so that an exemption would not ameliorate their concerns."
No Going Back
With major companies investing massive sums in training the next AI models, security researchers could find themselves targeted by some pretty deep pockets. Luckily, the security community has established fairly well-defined practices for handling vulnerabilities, says BugCrowd's Ellis.
"The idea of security research being being a good thing — that's now kind of common enough ... so that the first instinct of folks deploying a new technology is not to have a massive blow up in the same way we have in the past," he says. "Cease and desist letters and [other communications] that have gone back and forth a lot more quietly, and the volume has been kind of fairly low."
In many ways, penetration testers and red teams are focused on the wrong problems. The biggest challenge right now is overcoming the hype and disinformation about AI capabilities and safety, says Gary McGraw, founder of the Berryville Institute of Machine Learning (BIML), and a software security specialist. Red teaming aims to find problems, not be a proactive approach to security, he says.
"As designed today, ML systems have flaws that can be exposed by hacking but not fixed by hacking," he says.
Companies should be focused on finding ways to produce LLMs that do not fail in presenting facts — that is, "hallucinate" — or are vulnerable to prompt injection, says McGraw.
"We are not going to red team or pen test our way to AI trustworthiness — the real way to secure ML is at the design level with a strong focus on training data, representation, and evaluation," he says. "Pen testing has high sex appeal but limited effectiveness."