Cloud technologies are a rapidly evolving landscape. Securing cloud applications is everyone’s responsibility, meaning application development teams are needed to follow strict security guidelines from the earliest development stages, and to make sure of continuous security scans throughout the whole application lifecycle. The rise of generative AI enables new innovative approaches for addressing longstanding challenges with reduced effort.
This post showcases how engineering teams can automate efficient remediation of container CVEs (common vulnerabilities and exposures) early in their continuous integration (CI) pipeline. Using cloud services such as Amazon Bedrock, Amazon Inspector, AWS Lambda, and Amazon EventBridge you can architect an event-driven serverless solution for automatically addressing container vulnerabilities detection and patching. Using the power of generative AI and serverless technologies can help simplify what used to be a complex challenge.
Overview
The exponential growth of modern applications has enabled developers to build highly decoupled microservice-based architectures. However, the distributed nature of those architectures comes with a set of operational challenges. Engineering teams were always responsible for various security aspects of their application environments, such as network security, IAM permissions, TLS certificates, and code vulnerability scanning. Addressing these aspects at the scale of dozens and hundreds of microservices requires a high degree of automation. Automation is imperative for efficient scaling as well as maintaining control and governance.
Running applications in containers is a common approach for building microservices. It allows developers to have the same CI pipeline for their applications, regardless of whether they use Amazon Elastic Kubernetes Service (Amazon EKS), Amazon Elastic Container Service (Amazon ECS), or AWS Lambda to run it. No matter which programming language you use for your application, the deployable artifact is a container image that commonly includes application code and its dependencies. It is imperative for application development teams to scan those images for vulnerabilities to make sure of their safety prior to deploying them to cloud environments. Amazon Elastic Container Registry (Amazon ECR) is an OCI artifactory that provides two types of scanning, Basic and Enhanced, powered by the Amazon Inspector. The image scanning occurs after the container image is pushed to the registry. The basic scanning is triggered automatically when a new image is pushed, while the enhanced scanning runs continuously for images hosted in Amazon ECR. Both types of scans generate scan reports, but it is still the development team’s responsibility to act on it: read the report, understand the vulnerabilities, patch code, open a pull request, merge, and run CI again. The following steps illustrate how you can build an automated solution that uses the power of generative AI and event-driven serverless architectures to automate this process.
The following sample solution uses the “in-context learning” approach, a technique that tailors AI responses to narrow scenarios. Used for CVE patching, the solution builds AI prompts based on the programming language in question and a previously generated example of what a PR might look like. This approach underscores a crucial point: for some narrow use cases, using a smaller Large Language Model (LLM), such as Llama 13B, with assisted prompt might yield equally effective results as a bigger LLM, such as Llama 2 70B. We recommend that you evaluate both few-shot prompts with smaller LLMs and zero-shot prompts with larger LLMs to find the model that works most efficiently for you. Read more about providing prompts and examples in the Amazon Bedrock documentation.
Solution architecture
Prior to packaging the application as a container, engineering teams should make sure that their CI pipeline includes steps such as static code scanning with tools such as SonarQube or Amazon CodeGuru, and image analysis tools such as Trivy or Docker Scout. Validating your code for vulnerabilities at this stage aligns with the shift-left mentality, and engineers should be able to detect and address potential threats in their code in the earliest stages of development.
After packaging the new application code and pushing it to Amazon ECR, the image scanning with Amazon Inspector is triggered. Engineers can use languages supported by Amazon Inspector. As image scanning runs, Amazon Inspector emits EventBridge Finding events for each vulnerability detected.
- CI is triggered by a developer pushing new code to the shared code repository. This step is not implemented in the provided sample, and different engineering teams can use different tools for their CI pipeline.
- The application container image is built and pushed to the Amazon ECR.
- Amazon Inspector is triggered automatically. Note that you must first enable Amazon Inspector ECR enhanced scanning in your account.
- As Amazon Inspector scans the image, it emits findings in a format of events to EventBridge. Each finding generates a separate event. See the example JSON payload of a finding event in the Inspector documentation.
- EventBridge is configured to invoke a Lambda function for each finding event.
- Lambda is invoked for each finding. The function aggregates and updates the Amazon DynamoDB database table with each finding information.
- Once Amazon Inspector completes the scan, it emits the scan complete event to EventBridge, which calls the PR creation microservice hosted as an Amazon ECS Fargate Task to start the PR generation process.
- PR creation microservice clones the code repo to see the current dependencies list. Then it retrieves the aggregated findings data from DynamoDB, builds a prompt using the dependencies list, findings data, and in-context learning example based on previous scans. The microservice invokes Amazon Bedrock to generate a new PR content.
- Once the PR content is generated, the microservice opens a new PR and pushes changes upstream.
- Engineering teams validate the PR and merge it with code repository. Overtime, as engineering teams gain trust with the process, they might consider automating the merge part as well.
Sample implementation
Use the example project to replicate this solution in your AWS account. Follow the instructions in README.md for provisioning and testing the sample project using Hashicorp Terraform.
Under the /apps directory of the sample project you should see two applications. The /apps/my-awesome-application intentionally contains a set of vulnerable dependencies. This application was used to create examples of what a PR should look like. Once the engineering team took this application through Amazon Inspector and Amazon Bedrock manually, a file containing this example was generated. See in_context_examples.py. Although it can be a one-time manual process, engineering teams can also periodically add more examples as they evolve and improve the generative AI model response.
The /apps/my-amazing-application is the actual application that the engineering team works on delivering business value. They deploy this application several times a day to multiple environments, and they want to make sure that it doesn’t have vulnerabilities. Based on the in-context example created previously, they’re continuously using Amazon Inspector to detect new vulnerabilities, as well as Amazon Bedrock to automatically generate pull requests that patch those vulnerabilities.
The following example shows a pull request generated when a member of the development team has introduced vulnerable dependencies. The pull request contains details about the packages with detected vulnerabilities and CVEs, as well as recommendations for how to patch them.
Moreover, the pull request already contains an updated version of the requirements.txt file with the changes in place. The only thing left for the engineering team to do is review and merge the pull request.
Conclusion
This post illustrates a simple solution to address container image (OCI) vulnerabilities using AWS Services such as Amazon Inspector, Amazon ECR, Amazon Bedrock, Amazon EventBridge, AWS Lambda, and Amazon Fargate. The serverless and event-driven nature of this solution helps make sure of cost efficiency and minimal operational overhead. Engineering teams do not need to run additional infrastructure to implement this solution.
Using generative AI and serverless technologies helps simplify what used to be a complex and laborious process. Having an automated workflow in place allows engineering teams to focus on delivering business value, thereby improving overall security posture without extra operational overhead.
Checkout step-by-step deployment instructions and sample code for the solution discussed in the post in this GitHub repository.
References
https://aws.amazon.com/blogs/aws/amazon-bedrock-now-provides-access-to-llama-2-chat-13b-model/
https://docs.aws.amazon.com/bedrock/latest/userguide/general-guidelines-for-bedrock-users.html