When facing a game-changing technology like generative AI, experimenting is a great way to learn what it can do and how to use it. And indeed, we at AWS see plenty of companies doing precisely that, becoming more sophisticated and able to ask more informed questions about where the technology is heading and what it might mean for them. But we also see many of them stuck in a proof-of-concept phase, never quite getting to production even though their proofs of concept seem to be successful.
It might be because they sense risk. That’s not surprising; the media is filled with stories of AI hallucinations, toxic speech, bias, and inaccuracy. Deploying new technologies always carries risks; leaders must manage these risks to an acceptable level. And even hiring and deploying a human being carries some of these risks—they too might demonstrate bias or provide inaccurate information. As with anything a business does, deploying generative AI applications is a matter of mitigating risks until they are outweighed by the benefits of the new capabilities. There are ways to manage AI risks, and more are certainly on their way.
Some leaders are also savvy enough to worry about what their generative AI applications will cost once they are used at scale. But one of the goals of a proof of concept should be to get a feel for the likely costs. And these costs will probably go down over time as foundation models evolve, as providers compete, and as businesses are offered a choice of models that have different price/performance characteristics.
But I wonder if this is not really about risk. Perhaps the real issue is that many companies are not actually committed to deploying their proof-of-concept applications into production. Without obsessing over terminology (experiment, proof of concept, pilot, and so on), it’s important to note that a proof of concept aims to reduce risk and to learn about an application the business intends to deploy. Enterprises typically identify a business objective, plan to use a technology to achieve it, note the risks or challenges involved, and then design a proof of concept to mitigate those risks and challenges before committing to a full investment. There is a clear definition of success: mitigate the risks you fear, learn what will be involved in implementation, or accomplish whatever else the proof of concept intends to accomplish. The proof of concept is a step toward deploying technology to meet a worthwhile business goal. True, the deployment may be canceled if the proof of concept shows it’s unachievable or too risky. But the process starts with an intention to use the technology because the business goal is considered worth pursuing.
Contrast that with the many generative AI experiments today. A company identifies 100 possible use cases and tries a foundation model to see how it could fulfill each one. This is a great approach initially for learning the technology and inspiring ideas for how to use it. Companies should experiment. But it’s not always a good path to production.
First, this method only tests the foundation model in its current state, along with its prompts and integrations. It doesn’t test the business case. Second, there’s no clear definition of success; since the company didn’t begin with the intention to deploy and didn’t identify the particular risks it needed to mitigate in order to get there, the result of the experiment can only be “That’s really cool!” Third, the experiment doesn’t systematically mitigate the risks that will be raised as concerns when it’s time to deploy. And fourth, the resources aren’t in place to get it to production: a business case will later have to be made to obtain them. At best, the experiment has shown that an application can do something relevant in a use case, but that is a far cry from proving out a business case.
Now that we’ve all had a chance to play with generative AI and conduct experiments to learn more, it’s time to focus on getting value from it. As with other technologies we’ve deployed in the past, that’s a matter of finding important business objectives that can be met with it, outlining a business case, managing risks, securing resources, and going to production. It’s not a matter of experimenting with use cases; it’s a matter of designing a solution to a business problem that matters and moving toward solving it. That mental model leads naturally to production.
On the way you’ll realize that production-grade generative AI applications require production-grade security, privacy protection, compliance, agility, cost management, operational support, and resilience. Most generative AI applications need to be integrated with other enterprise applications, connected to enterprise data sources, and controlled by enterprise guardrails.
A true proof of concept (as opposed to a learning experiment) includes a path to deployment with all enterprise features. You’ll want to add to the proof of concept, test it in real-world situations, apply your enterprise security model, and do all the other things we do and have always done in enterprise IT.
This aligns with AWS’s vision for generative AI (and classical machine learning and future technologies). What matters is how a technology can help AWS customers accomplish business, mission, or social objectives, not the technology itself. We design and build our AI tools from the ground up to meet our exacting standards for security and reliability and to fit within existing enterprise frameworks for compliance, guardrails, operability, and data management. They are designed for agility: for example, Amazon Bedrock offers access to many foundation models through a single API, making it easy to take advantage of new ones as they evolve. Anthropic’s Claude 3, a leading model on industry benchmarks at the time of writing, is available through that API, as are other models that offer different tradeoffs on price, speed, and accuracy. AWS has always envisioned generative AI taking its place in the enterprise’s broader technology estate.
If you’re serious about using generative AI to meet a known and important business objective, think of it as functionality that is on a path to production and value creation. Proofs of concept are an important way to manage risks and validate your business case—not your case for the technology itself but for the business functionality you create with it.
You will never reduce your risks to zero because there are always risks to deploying something new (even, as I said before, human employees). But you can work to mitigate these risks to an acceptable level and operate within the guardrails of responsible AI. AWS has done the heavy lifting to help you mitigate risks: its infrastructure is architected to be the world’s most secure and reliable. The risk management framework you use for your other IT systems carries forward into the new generative AI applications you deploy. The path to production is open.