Log CloudTrail events to DynamoDB using an AWS Step Functions state machine


Introduction

If you’ve ever browsed through the different AWS services and offerings, you might have come across AWS Step Functions state machines and been put off by the name. In reality, the state machine service allows you to chain events easily and seamlessly. Beyond chaining other AWS services together, state machines have some cool integrations up their sleeves, which can save you from writing your own logic. Fewer moving parts means less stuff to break!

This will be a two-part blog, where each part focuses on solving a real-world issue using state machines.

For this blog, I will be using a trick that allows a state machine to take in an input and write it directly to DynamoDB, without you having to write your own database insert logic. This maps to a real-world scenario where you need to log certain events from CloudTrail to a DynamoDB table.

AWS Resources needed

In order to complete this demo, I’m going to need to create a few AWS resources:

A DynamoDB table: this will be used to save the event.
A state machine: for obvious reasons :).
An EventBridge rule: this will be used to transfer the events from CloudTrail to our state machine.

1 – DynamoDB

First off, we need to create our DynamoDB table. For my example, I’m logging the event as it is, and I’m using the EventID, which is readily available as part of the event, as my Partition Key.

Here is the Terraform snippet for my DynamoDB table. Be aware that, depending on your traffic pattern, on-demand mode may not be the most economical billing mode:

resource "aws_dynamodb_table" "state_machine_demo_table" {
  name         = "state-machine-demo-table"
  tags         = var.tags
  billing_mode = "PAY_PER_REQUEST" # On-demand
  hash_key     = "EventID"

  attribute {
    name = "EventID"
    type = "S"
  }
}

2 – State machine

Our state machine definition will be quite simple. Let’s look at it and break it down.

{
  "Comment": "transfer CloudTrail events to DynamoDB.",
  "StartAt": "WriteToDynamoDB",
  "States": {
    "WriteToDynamoDB": {
      "Type": "Task",
      "Resource": "arn:aws:states:::dynamodb:putItem",
      "Parameters": {
        "TableName": "state-machine-demo-table",
        "Item": {
          "EventID": {
            "S.$": "$.detail.eventID"
          },
          "EventData": {
            "S.$": "States.JsonToString($.detail)"
          }
        }
      },
      "End": true
    }
  }
}

The state machine has only one task, which is to write to our table. We use the arn:aws:states:::dynamodb:putItem resource, which provides the native integration between the state machine and DynamoDB.

In the Parameters section, we specify the table name and the Item. Each element under Item becomes an attribute on the DynamoDB item. Since EventID is our partition key, this one is mandatory. Note that you can add as many attributes as you need.

If you look closely at the notation, you'll see this: "S.$": "$.detail.eventID". The S means we're inserting an attribute of type string, and the .$ suffix tells Step Functions to resolve the value as a JSONPath expression against the state input. CloudTrail events delivered through EventBridge nest their eventID under the detail section, which is why we retrieve it from there. In other words, you can apply JSONPath logic to the incoming JSON object so that you only keep the data you want. This is a really powerful feature, and a necessary one if you plan on inserting complex data.
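For reference, here is a heavily trimmed example of what a CreateSubnet event roughly looks like by the time EventBridge hands it to the state machine. All IDs below are made up, and the real detail object contains many more fields:

{
  "version": "0",
  "id": "11111111-2222-3333-4444-555555555555",
  "detail-type": "AWS API Call via CloudTrail",
  "source": "aws.ec2",
  "account": "123456789012",
  "region": "eu-west-1",
  "detail": {
    "eventID": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
    "eventName": "CreateSubnet",
    "eventSource": "ec2.amazonaws.com",
    "awsRegion": "eu-west-1"
  }
}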

The other attribute is our event data, which is the whole detail object turned into a string via the intrinsic function States.JsonToString().
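If you want to keep everything in Terraform like the table above, here is a minimal sketch of how the state machine itself could be deployed. The resource names, role names, and definition file path are my own assumptions; the important part is that the execution role is allowed to call dynamodb:PutItem on our table:

# Assumed execution role that lets Step Functions write to the table
resource "aws_iam_role" "state_machine_role" {
  name = "state-machine-demo-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "states.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy" "allow_put_item" {
  name = "allow-put-item"
  role = aws_iam_role.state_machine_role.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "dynamodb:PutItem"
      Resource = aws_dynamodb_table.state_machine_demo_table.arn
    }]
  })
}

resource "aws_sfn_state_machine" "state_machine_demo" {
  name     = "state-machine-demo"
  role_arn = aws_iam_role.state_machine_role.arn
  # The Amazon States Language definition shown above, saved next to the Terraform code
  definition = file("${path.module}/state_machine_definition.json")
}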

3 – EventBridge

Quite simply, this is an EventBridge rule that has the state machine as its target. The event pattern can match whatever events you would like to catch. For the sake of this example, I added two events (CreateSubnet and DeleteSubnet).

This is what my event pattern looks like:

{
  "detail-type": [
    "AWS API Call via CloudTrail"
  ],
  "detail": {
    "eventName": [
      "CreateSubnet",
      "DeleteSubnet"
    ]
  }
}

When creating your EventBridge rule, remember to associate it with the default event bus; this ensures the rule picks up the CloudTrail events. Also, remember to set your state machine as the target of the rule.
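Sticking with Terraform, a sketch of that rule and its target could look like the following. It reuses the aws_sfn_state_machine resource assumed in the previous snippet, the names are again my own, and EventBridge needs an IAM role that is allowed to start executions of the state machine. Omitting event_bus_name keeps the rule on the default event bus:

resource "aws_cloudwatch_event_rule" "subnet_events" {
  name = "subnet-events-to-state-machine"
  # Same event pattern as shown above; no event_bus_name means the default event bus
  event_pattern = jsonencode({
    "detail-type" = ["AWS API Call via CloudTrail"]
    detail = {
      eventName = ["CreateSubnet", "DeleteSubnet"]
    }
  })
}

# Assumed role that allows EventBridge to start the state machine
resource "aws_iam_role" "eventbridge_invoke_role" {
  name = "eventbridge-invoke-state-machine"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "events.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy" "allow_start_execution" {
  name = "allow-start-execution"
  role = aws_iam_role.eventbridge_invoke_role.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "states:StartExecution"
      Resource = aws_sfn_state_machine.state_machine_demo.arn
    }]
  })
}

resource "aws_cloudwatch_event_target" "to_state_machine" {
  rule     = aws_cloudwatch_event_rule.subnet_events.name
  arn      = aws_sfn_state_machine.state_machine_demo.arn
  role_arn = aws_iam_role.eventbridge_invoke_role.arn
}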

Conclusion

With this, we have just created an AWS state machine that takes CloudTrail events as input and writes them directly to a DynamoDB table. This can be extended into a more complex architecture where your data comes in from another source and where you retrieve only certain fields in your state machine.

Part 2

In part 2 of this blog, I’m going to show you how to chain multiple Lambda functions, so that the output of one is fed as input to the next. More importantly, I’m going to show you how to trigger parallel executions of your Lambda function, such that for a list of objects, a Lambda is triggered for each one. This trick will help you process data in parallel. Check it out here!
