Using AI to improve security and learning in your AWS environment.

Why We’re Building a Chatbot: Empowering Our Platform Team

Our Platform engineers are the backbone of our secure development process, but they often face hurdles that slow them down and hinder their ability to deliver top-notch work. On top of this, there are different levels of expertise within the team, and consulting all the available documentation can be a daunting, time-consuming task. A chatbot could significantly reduce the coaching load that senior members carry for junior members of the team.

To address these challenges and empower the teams, I chose to build a chatbot specifically designed to assist them.

Our team’s stack is composed mainly of Airflow and Spark, with the infrastructure hosted on AWS.

Here’s a closer look at the pain points we’re aiming to solve:

Streamlining Security Testing: Finding potential security vulnerabilities in code can be time-consuming. The chatbot helps engineers identify issues quickly and efficiently: it can trigger automated security scans through Spark or Airflow jobs, analyze scan results from S3 buckets, and present findings in a user-friendly way. This has become even more pressing under GDPR. It can also serve as a PII-detection tool specifically tailored to our environment if, for whatever reason, Amazon Macie is not on the cards.
Enhancing Security Visibility: The team needs a clear picture of potential software vulnerabilities across all applications, such as out-of-support versions. The chatbot can pull data from various sources, like security reports stored in S3 buckets, and use this information to identify trends and potential vulnerabilities across the applications running on our EC2 instances within our VPCs.
Reducing the Teaching Workload: An easily accessible but secure chatbot for consulting all the platform’s documentation, fully tailored to our environment, will benefit everyone on the team. Junior members will become more independent and confident in what they do, and senior members will have more time to spend on their own tickets.
Simplifying Security Practices: Developing strong threat models and security policies is crucial, but it can be complex. The chatbot will offer guidance and best practices, making it easier for engineers to implement these essential safeguards.
Boosting Infrastructure Security: Infrastructure as Code (IaC) plays a vital role in our development process. The chatbot can analyze CloudFormation or Terraform templates to identify potential security risks before they are deployed to EC2 instances.
Enforcing Security Pipelines: Integrating security checks seamlessly into our development pipelines is essential. The chatbot helps enforce these checks, so security is never an afterthought: it can trigger security scans within Airflow pipelines, ensuring vulnerabilities are identified and addressed before code is deployed to the production instances behind our ELBs.
Ensuring Quality Control: We are committed to delivering high-quality solutions. The chatbot will provide us with valuable insights and data, enabling us to maintain a high level of control over the development process.
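To make the PII-detection idea above concrete, here is a minimal, self-contained sketch. The patterns and the `scan_text` helper are illustrative assumptions of mine, not part of any AWS service; a real scanner (or Amazon Macie) covers far more cases and locales.

```python
import re

# Hypothetical PII patterns -- illustrative only, not exhaustive.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def scan_text(text: str) -> dict[str, list[str]]:
    """Return every PII-like match found in `text`, keyed by pattern name."""
    hits = {}
    for name, pattern in PII_PATTERNS.items():
        found = pattern.findall(text)
        if found:
            hits[name] = found
    return hits

print(scan_text("Contact jane@example.com from 10.0.0.1"))
# → {'email': ['jane@example.com'], 'ipv4': ['10.0.0.1']}
```

In our setup, a helper like this could run over scan results the chatbot pulls from S3 before presenting findings to engineers.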

By building this chatbot, we’re investing in secure and reliable software, as well as promoting a culture of continuous learning and self-development.

Here’s a detailed breakdown of the steps I followed to build our chatbot:

Step 0: Deploying the Base Resources

Use the following CloudFormation template to deploy all the resources.
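If you prefer to script the deployment instead of using the console, a minimal boto3 sketch follows. The stack name, template file name, and the IAM capability are my own placeholder assumptions, not details taken from the actual template.

```python
def build_stack_request(stack_name: str, template_body: str) -> dict:
    """Assemble create_stack parameters. CAPABILITY_NAMED_IAM is assumed
    because stacks like this typically create IAM roles for Kendra/SageMaker."""
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        "Capabilities": ["CAPABILITY_NAMED_IAM"],
    }

def deploy(template_path: str = "chatbot-stack.yaml") -> None:
    """Create the stack and wait for completion (requires AWS credentials)."""
    import boto3  # imported lazily so the helper above has no AWS dependency

    cfn = boto3.client("cloudformation")
    with open(template_path) as f:  # placeholder template file name
        request = build_stack_request("devsecops-chatbot", f.read())
    cfn.create_stack(**request)
    cfn.get_waiter("stack_create_complete").wait(StackName="devsecops-chatbot")
```

Call `deploy()` once your AWS credentials are configured.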

Step 1: Adding Documents to Amazon Simple Storage Service (S3)

What it is: Amazon S3 is a secure, scalable object storage service that acts as the foundation for our chatbot’s knowledge base.
The Why: You’ll store essential documents like security best practices, threat model templates, and IaC security guidelines in S3. These documents will be specific to your own production environment, so the answers can be fully tailored to your requirements. I personally chose to feed it with Spark and Airflow documentation, as well as general security docs.
How to do it:

Create an S3 bucket: This acts as a virtual folder where you’ll store your documents.
Upload relevant documents: Upload security policies, best practice guides, and any other resources you might need.
Configure access permissions: Ensure the chatbot has the necessary permissions to access and retrieve information from the S3 bucket.
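The console steps above can also be scripted. A boto3 sketch, assuming a placeholder bucket name and a hypothetical local `docs/` folder of PDFs:

```python
import mimetypes
from pathlib import Path

def guess_content_type(path: str) -> str:
    """Pick a Content-Type for the upload (defaults to binary)."""
    return mimetypes.guess_type(path)[0] or "application/octet-stream"

def upload_docs(bucket: str = "my-chatbot-docs") -> None:
    """Create the bucket and upload local docs (requires AWS credentials).
    Bucket names are globally unique, so this name is a placeholder."""
    import boto3

    s3 = boto3.client("s3")
    s3.create_bucket(Bucket=bucket)  # us-east-1; other regions need a LocationConstraint
    for doc in Path("docs").glob("*.pdf"):  # Spark/Airflow/security docs
        s3.upload_file(str(doc), bucket, doc.name,
                       ExtraArgs={"ContentType": guess_content_type(doc.name)})
```

Access permissions for the chatbot (step three) are granted through IAM roles rather than in this upload code.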

Step 2: Searching with Amazon Kendra

What it is: Amazon Kendra is an intelligent search service that allows users to easily find relevant information across various sources like S3 buckets.
The Why: Kendra will be crucial for your chatbot to efficiently search the vast amount of security information stored in S3.
How to do it:

1. Create a Kendra index: This index tells Kendra where to look for information — in this case, the S3 bucket containing your security documents.

2. Add an S3 connector and link it to the S3 bucket that contains the data.

3. Run “Sync now” so Kendra ingests the documents.
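For reference, the same three steps can be sketched with boto3. The index name, bucket, and role ARNs below are placeholders (in practice they would come from the CloudFormation stack outputs):

```python
def s3_data_source_config(bucket: str) -> dict:
    """Minimal S3 data source configuration block for Kendra."""
    return {"S3Configuration": {"BucketName": bucket}}

def create_kendra_resources(bucket: str = "my-chatbot-docs") -> None:
    """Create index, attach an S3 connector, and sync (requires AWS credentials)."""
    import boto3

    kendra = boto3.client("kendra")
    index = kendra.create_index(
        Name="chatbot-docs",
        Edition="DEVELOPER_EDITION",
        RoleArn="arn:aws:iam::123456789012:role/KendraIndexRole",  # placeholder
    )
    # Note: index creation is asynchronous -- in practice, wait until the
    # index status is ACTIVE before adding the data source.
    source = kendra.create_data_source(
        IndexId=index["Id"],
        Name="docs-bucket",
        Type="S3",
        Configuration=s3_data_source_config(bucket),
        RoleArn="arn:aws:iam::123456789012:role/KendraSourceRole",  # placeholder
    )
    # Equivalent of clicking "Sync now" in the console:
    kendra.start_data_source_sync_job(Id=source["Id"], IndexId=index["Id"])
```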

Step 3: Setting Up Access to Amazon Bedrock

What it is: Amazon Bedrock is a fully managed service that gives you access to foundation models from providers such as Anthropic and Amazon through a single API.
The Why: The chatbot leverages Bedrock’s models to answer our engineers’ questions, guide them in developing secure IaC configurations, and flag potential security risks in their code.
How to do it:

You will need access to Anthropic’s Claude and Amazon’s Titan Text Express models. If access hasn’t been granted yet, you’ll need to request it now from the Bedrock console.
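A quick way to confirm access is to invoke a model directly. A sketch, assuming the Titan Text Express model ID and my own helper names; the call fails with an AccessDeniedException until model access has been granted:

```python
import json

def titan_request(prompt: str, max_tokens: int = 256) -> str:
    """Build the JSON request body that Titan Text Express expects."""
    return json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {"maxTokenCount": max_tokens},
    })

def smoke_test(prompt: str = "Say hello") -> str:
    """One-off invocation to confirm Bedrock model access
    (requires AWS credentials and granted model access)."""
    import boto3

    runtime = boto3.client("bedrock-runtime")
    response = runtime.invoke_model(
        modelId="amazon.titan-text-express-v1",
        body=titan_request(prompt),
    )
    return json.loads(response["body"].read())["results"][0]["outputText"]
```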

Step 4: Using SageMaker Studio IDE to Build Your Chatbot

What it is: SageMaker Studio IDE is a cloud-based integrated development environment (IDE) designed for building and deploying machine learning models.
The Why: SageMaker Studio provides the tools and resources you need to develop your chatbot’s core functionality, including natural language processing (NLP) and dialogue management capabilities.
How to do it:

Choose an appropriate Large Language Model (LLM): LLMs are AI models trained on massive amounts of text data, forming the foundation for your chatbot’s ability to understand and respond to user queries. We will be using Claude and Titan Express.
Ground the LLM in your security knowledge base: The model itself is not retrained. Instead, at query time the chatbot retrieves relevant passages from the documents stored in S3 (via Kendra) and feeds them to the LLM, so answers use the specific language and concepts of your DevSecOps environment.
From the terminal, git clone the following repo: git clone https://github.com/aws-samples/generative-ai-to-build-a-devsecops-chatbot/

From the terminal, you will have to export Kendra’s Index ID, like this: 
export KENDRA_INDEX_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
Install the requirements:
pip install -r requirements.txt
pip install -U langchain-community
Then use Streamlit to launch the app’s interface by running the following command: streamlit run app.py titan
Get the URL from this page and remove the /lab path. Instead, add this: /proxy/8501/
Navigate to that URL and you will see your own chatbot live and running, how exciting!

NOTE: If you want to use Claude run this command instead: streamlit run app.py claudeInstant
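Under the hood, the app follows the retrieval-augmented generation (RAG) pattern: Kendra retrieves relevant passages and the Bedrock LLM answers from them. This is not the repo’s exact code, just a minimal sketch of the pattern; the `build_prompt` and `ask` helpers are mine:

```python
import os

def build_prompt(question: str, passages: list[str]) -> str:
    """Stuff retrieved passages into a simple RAG prompt."""
    context = "\n\n".join(passages)
    return f"Answer using only this documentation:\n{context}\n\nQuestion: {question}"

def ask(question: str) -> str:
    """Retrieve from Kendra, then answer with a Bedrock model
    (requires AWS credentials and the KENDRA_INDEX_ID env var)."""
    from langchain_community.llms import Bedrock
    from langchain_community.retrievers import AmazonKendraRetriever

    retriever = AmazonKendraRetriever(index_id=os.environ["KENDRA_INDEX_ID"])
    llm = Bedrock(model_id="amazon.titan-text-express-v1")
    docs = retriever.invoke(question)
    return llm.invoke(build_prompt(question, [d.page_content for d in docs]))
```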

Why would you want to test different LLMs?
Different LLMs have varying strengths and weaknesses. Testing with multiple options helps identify the LLM that best understands the terminology and delivers the most accurate and helpful responses.

Step 5: Remember to clean up your resources to avoid any potential charges.
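A cleanup sketch with boto3; the bucket and stack names are placeholders. Note that the bucket must be emptied before CloudFormation can delete it:

```python
def object_keys(page: dict) -> list[dict]:
    """Convert one list_objects_v2 page into the delete_objects format."""
    return [{"Key": obj["Key"]} for obj in page.get("Contents", [])]

def cleanup(bucket: str = "my-chatbot-docs",
            stack: str = "devsecops-chatbot") -> None:
    """Empty the docs bucket, then delete the stack (requires AWS credentials)."""
    import boto3

    s3 = boto3.client("s3")
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket):
        if page.get("Contents"):
            s3.delete_objects(Bucket=bucket, Delete={"Objects": object_keys(page)})
    boto3.client("cloudformation").delete_stack(StackName=stack)
```

Remember to also delete the Kendra index if it was created outside the stack, since it bills hourly.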
