Security considerations of configuration management

We often take for granted that the security implications of using configuration management tools like Puppet, Chef, or Ansible are obvious, but that’s far from the truth. I would be lying if I said that I’d never deployed configuration management code into production whose danger, in retrospect, should have been obvious from the start. Here’s a great example: the ability to destroy the online Puppet code validator was live for over a year!

So with that in mind, let’s take a wander through the Puppet ecosystem and talk about some things that might have security implications and might warrant a closer look in regard to access control. And let’s not bury the lede here: no matter what framework you’re running, configuration management is by definition root/Administrator-level access to your entire infrastructure, and access to its setup and codebase should be treated as such. The examples referenced in this guide are specific to Puppet, but the higher-level concepts are not. If you’re using a competing configuration management platform, you should evaluate your usage for similar patterns.

Taking a security mindset here means that this guide will refer to “attack vectors.” This does not mean that there is a vulnerability. It simply means that there might be potential for an attacker to exploit a misconfiguration. And to be clear, this is not intended to be an exhaustive checklist of the things you should avoid; instead, it’s a guide to how to think about code deployed into a configuration management infrastructure.

The basic trust model of Puppet and other configuration management systems is that you are expected to maintain trusted control over your codebase.

In hindsight, that’s kind of obvious — configuration management works by distributing certain kinds of code and data across your infrastructure and executing it with admin privileges to ensure that your systems are configured in the state you want.

But there are a few places where this executable code isn’t obvious, and less scrutiny means that these areas can be used as attack vectors. For example, functions are executed on the server during catalog compilation. That covers any custom functions included in modules, but also Ruby code embedded in ERB templates, shell commands run by the generate() function, and so on. The config_version script is executed prior to a catalog compilation; people usually use it to expose git revisions or the like, but it’s just a script. It can run anything you want, and it’s right in the control repository. And custom facts sync out and start running with root privileges across your whole infrastructure as soon as you install their modules, whether or not those modules are classified onto any node.
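To make the ERB point concrete, here is a minimal sketch that uses Ruby’s standard ERB library directly rather than Puppet’s own template handling. The template text, the server names, and the log variable are all made up for illustration, and Puppet exposes variables to templates as @instance variables where this sketch uses plain locals to stay self-contained. The two templates look equally harmless in a directory listing, but only the first is pure presentation.

```ruby
require 'erb'

# Presentation logic only: loop over an array to build config stanzas.
BENIGN_TEMPLATE = <<~'TEMPLATE'
  <% servers.each do |server| -%>
  server <%= server %>
  <% end -%>
TEMPLATE

# Looks like a config file, but the first tag runs arbitrary Ruby.
# Here it only appends to a log; real code could do anything the
# rendering process is allowed to do.
SUSPICIOUS_TEMPLATE = <<~'TEMPLATE'
  <% log << 'ran at render time' -%>
  listen 8080
TEMPLATE

# Render a template with the given local variables.
def render(template, locals)
  ERB.new(template, trim_mode: '-').result_with_hash(locals)
end

puts render(BENIGN_TEMPLATE, servers: ['ntp1.example.com', 'ntp2.example.com'])

log = []
puts render(SUSPICIOUS_TEMPLATE, log: log)
puts log.inspect
```

The rendered output of the second template gives no hint that anything ran; the only evidence is the side effect itself. That is why templates need to be read, not just rendered.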

Security recommendations:

When installing new modules, audit them as follows:

Check for custom facts and functions and see what they do.
Check for ERB templates and ensure that they contain only presentation or layout code, such as iterating over an array to build stanzas in a configuration file. Any other Ruby logic is suspect. Check manifests for use of the inline_template() function and validate those inline templates in the same way.
Skim the manifests for use of the generate() function. There are extremely few valid use cases for this function, so any sight of it should be a red flag.

When reviewing pull/merge requests to your control repository:

Pay very close attention to any changes to the config_version script.
If new modules are added to the Puppetfile, then verify that they’ve been audited.
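Part of the module audit above can be turned into a first-pass script. The following is a rough sketch, not a vetted tool: the audit_module name and the patterns are my own assumptions, the directory layout is the conventional module layout, and regex matching will miss anything obfuscated. It only tells you which files to open and read; it cannot tell you whether they are safe.

```ruby
require 'find'

# Heuristic patterns chosen for this sketch; not an official or complete list.
MANIFEST_PATTERNS = {
  'generate() call'        => /\bgenerate\s*\(/,
  'inline_template() call' => /\binline_template\s*\(/
}.freeze

# Return a sorted list of "path: reason" strings for files to review by hand.
def audit_module(module_dir)
  root = File.expand_path(module_dir)
  findings = []
  Find.find(root) do |path|
    next unless File.file?(path)

    rel = path.delete_prefix("#{root}/")
    case rel
    when %r{\Alib/facter/.+\.rb\z}
      findings << "#{rel}: custom fact"
    when %r{\Afacts\.d/.+}
      findings << "#{rel}: external fact"
    when %r{\Alib/puppet/(parser/)?functions/.+\.rb\z}, %r{\Afunctions/.+\.pp\z}
      findings << "#{rel}: custom function"
    when %r{\Atemplates/.+\.erb\z}
      findings << "#{rel}: ERB template"
    when %r{\Amanifests/.+\.pp\z}
      source = File.read(path)
      MANIFEST_PATTERNS.each do |reason, pattern|
        findings << "#{rel}: #{reason}" if source.match?(pattern)
      end
    end
  end
  findings.sort
end

# Usage: ruby audit.rb path/to/module
puts audit_module(ARGV[0]) if ARGV[0]
```

An empty result does not mean a module is clean, only that none of these particular patterns matched.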

The Puppetfile is another surprise. It looks like a data file, and so many people doing code review treat it like data. But it’s actually a custom DSL implemented in Ruby, meaning that it’s possible (although highly discouraged) to include arbitrary code that will run during a codebase deploy. This is a sneaky attack vector: if someone can create a branch in your control repository, that branch and its Puppetfile may be deployed and executed before anyone else has a chance to review it.
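If “a DSL implemented in Ruby” sounds abstract, this toy loader may help. To be clear, it is an illustration I wrote for this guide and not the code your deployment tool actually runs; the ToyPuppetfile class, the module names, and the global variable are all hypothetical. It mimics the general shape of a Ruby DSL, where keywords such as mod are ordinary methods and the file’s text is evaluated as Ruby.

```ruby
# A toy loader, assumed for illustration only.
class ToyPuppetfile
  attr_reader :modules

  def initialize
    @modules = []
  end

  # DSL keyword: record a module declaration.
  def mod(name, *_args)
    @modules << name
  end

  # Evaluate the Puppetfile text in the context of this object.
  def evaluate(source)
    instance_eval(source)
    self
  end
end

# Hypothetical Puppetfile content; the module names are made up.
EXAMPLE_PUPPETFILE = <<~'PUPPETFILE'
  mod 'example/ntp', '1.0.0'
  mod 'example/motd', '2.0.0'

  # Nothing in the DSL prevents a line like this between declarations.
  $deploy_side_effect = 'arbitrary Ruby ran during deploy'
PUPPETFILE

loaded = ToyPuppetfile.new.evaluate(EXAMPLE_PUPPETFILE)
puts loaded.modules.inspect
puts $deploy_side_effect
```

Both module declarations are recorded as expected, and the extra line runs right alongside them. A reviewer skimming for module names and versions would have no reason to notice it.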

Security recommendations:

Do not allow untrusted users the ability to create branches in your control repository.
Consider using VCS checks to reject commits containing unexpected code in your Puppetfile. An unsupported and lightly tested example is available here.
When reviewing pull/merge requests to your control repository, pay very close attention to any changes to the Puppetfile and review it like code rather than treating it like simple data.
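As one possible shape for such a check, here is a sketch that lexes the Puppetfile with Ruby’s bundled Ripper library and complains about any token that a plain declaration would not need. It is independent of the example linked above and just as lightly tested. The puppetfile_violations name, the list of accepted directives, and the set of accepted token types are my assumptions; a Puppetfile that legitimately uses other constructs would need the lists extended.

```ruby
require 'ripper'

# Directives this sketch accepts; extend to match what your Puppetfile uses.
ALLOWED_DIRECTIVES = %w[forge mod moduledir].freeze

# Token types that plain declarations are made of: whitespace, comments,
# string literals without interpolation, commas, symbols, and hash labels.
PLAIN_TOKENS = %i[
  on_sp on_nl on_ignored_nl on_comment on_comma
  on_tstring_beg on_tstring_content on_tstring_end
  on_symbeg on_label
].freeze

# Return a list of "line N: ..." complaints; empty means nothing unexpected.
def puppetfile_violations(source)
  violations = []
  previous_type = nil
  Ripper.lex(source).each do |(line, _col), type, text|
    ok =
      case type
      when :on_ident
        # An identifier is fine as a symbol name (:git) or a known directive.
        previous_type == :on_symbeg || ALLOWED_DIRECTIVES.include?(text)
      when :on_op then text == '=>'
      when :on_kw then %w[true false nil].include?(text)
      else PLAIN_TOKENS.include?(type)
      end
    violations << "line #{line}: unexpected #{text.strip.inspect}" unless ok
    previous_type = type
  end
  violations
end

# Usage: ruby check_puppetfile.rb Puppetfile
if ARGV[0]
  problems = puppetfile_violations(File.read(ARGV[0]))
  puts problems
  exit 1 unless problems.empty?
end
```

I chose an allow-list over a deny-list on purpose: anything the check does not recognize is rejected, so a new trick fails closed instead of slipping through. A script like this could run from a server-side hook or a CI job, but it complements human review and does not replace it.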

And of course, any modules you add to your control repo can run any custom extensions, or exec statements, or anything at all, anywhere they’re classified onto a node. Malicious modules can theoretically have a higher impact, because their code is invoked as the root or admin user on every node that uses them.

Security recommendations:

When installing new modules, audit them as follows:

Skim the manifests to get a general idea of how the module works and what it manages. Pay extra attention to anything that seems out of place.
Look for resources that run shell code, such as exec resources, cron jobs, the validate_cmd parameter of the file type, or the various command parameters (start, stop, restart, status) of the service type. Besides looking for malicious code, also inspect for unsafe interpolation that can be used for shell injection attacks.
Check for custom types and providers. These are not as simple to read but you should look for anything that looks out of place or unrelated to the thing it claims to manage.

If you identify any concerns, raise them as issues on the module’s repository or ask community peers about them in our Slack workspace.
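To help with the hunt for command-running parameters, a short script can list them and point out which ones build their command from a variable. This is a line-based sketch under several assumptions: the command_findings name and the parameter list are mine, it expects one parameter per line, and it does not parse the Puppet language, so unusual formatting will slip past it.

```ruby
# Parameters whose value is run as a command. A starting point assumed for
# this sketch, not an exhaustive list.
COMMAND_PARAM = /
  ^\s*(command|onlyif|unless|validate_cmd|start|stop|restart|status)
  \s*=>\s*(.+?),?\s*$
/x

# A value that is a bare variable, or a double-quoted string containing one.
USES_VARIABLE = /\A(?:"[^"]*)?\$\{?[a-z_:]/

# Return one "line N: param (note)" string per command-like parameter.
def command_findings(manifest_source)
  manifest_source.each_line.with_index(1).filter_map do |line, number|
    match = COMMAND_PARAM.match(line)
    next unless match

    note = if match[2].match?(USES_VARIABLE)
             'uses a variable, check for shell injection'
           else
             'static'
           end
    "line #{number}: #{match[1]} (#{note})"
  end
end

# Usage: ruby commands.rb manifests/init.pp
puts command_findings(File.read(ARGV[0])) if ARGV[0]
```

A “static” result still deserves a read to see what the command does. The variable flag only marks the places where a value from elsewhere, such as a class parameter, Hiera data, or a fact, ends up inside a command line.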

Most of this shouldn’t be terribly concerning. Again, automating the execution of code across your infrastructure is what infrastructure automation does. But it is important to keep that in mind and treat your control repository and anything it might contain as privileged code. Audit modules before you use them, or only use modules from trusted authors. Be careful about who has access to your control repo and who can change entries in your Puppetfile. And don’t forget about the various unexpected code execution triggers: classifying modules onto nodes isn’t the only way to get their code to run.

Remember, this is not an exhaustive list of everything to look for. But I do hope that it gives you an idea of the types of vectors that could be abused by malicious actors and some good habits to get into. What other safeguards do you have protecting your codebase? Drop them in the comments!