December 19, 2013

Day 19 - Automating IAM Credentials with Ruby and Chef

Written by: Joshua Timberman (@jtimberman)
Edited by: Shaun Mouton (@sdmouton)

Chef, née Opscode, has long used Amazon Web Services. In fact, the original iteration of "Hosted Enterprise Chef," "The Opscode Platform," was deployed entirely in EC2. In the time since, AWS has introduced many excellent features, along with libraries to work with them, including Identity and Access Management (IAM) and the AWS SDK. Especially relevant to our interests is the Ruby SDK, which is available as the aws-sdk RubyGem. Additionally, the operations team at Nordstrom has released a gem for managing encrypted data bags called chef-vault. In this post, I will describe how we use the AWS IAM feature, how we automate it with the aws-sdk gem, and how we store secrets securely using chef-vault.


First, here are a few definitions and references for readers.
  • Hosted Enterprise Chef - Enterprise Chef as a hosted service.
  • AWS IAM - management system for authentication/authorization to Amazon Web Services resources such as EC2, S3, and others.
  • AWS SDK for Ruby - RubyGem providing Ruby classes for AWS services.
  • Encrypted Data Bags - Feature of Chef Server and Enterprise Chef that allows users to encrypt data content with a shared secret.
  • Chef Vault - RubyGem to encrypt data bags using public keys of nodes on a chef server.

How We Use AWS and IAM

We have used AWS for a long time, since before the IAM feature existed. Originally with The Opscode Platform, we used EC2 to run all the instances. While we have moved our production systems to a dedicated hosting environment, we do have non-production services in EC2. We also have some external monitoring systems in EC2. Hosted Enterprise Chef uses S3 to store cookbook content. Those with an account can see this with knife cookbook show COOKBOOK VERSION, and note the URL for the files. We also use S3 for storing the packages from our omnibus build tool. The omnitruck metadata API service exposes this.

All these AWS resources - EC2 instances, S3 buckets - are distributed across a few different AWS accounts. Before IAM, there was no way to have data segregation, because the account credentials were shared across the entire account. For (hopefully obvious) security reasons, we need to keep the customer content separate from our non-production EC2 instances. Similarly, we need to keep the metadata about the omnibus packages separate from the packages themselves. In order to manage all these different accounts and their credentials, which need to be automatically distributed to the systems that need them, we use IAM users, encrypted data bags, and Chef.

Unfortunately, using various accounts adds complexity in managing all this, but through the tooling I'm about to describe, it is a lot easier to manage now than it was in the past. We use a fairly simple JSON data file format, and a Ruby script that uses the AWS SDK RubyGem. I'll describe the parts of the JSON file, and then the script.

IAM Permissions

IAM allows customers to create separate groups, which are containers of users, and grant those groups permissions to different AWS resources. Customers can manage these through the AWS console, or through the API. The API uses JSON documents to manage the policy statement of permissions the user has to AWS resources. Here's an example:

  {
    "Statement": [
      {
        "Action": "s3:*",
        "Effect": "Allow",
        "Resource": [
          "arn:aws:s3:::an-s3-bucket",
          "arn:aws:s3:::an-s3-bucket/*"
        ]
      }
    ]
  }

Granted to an IAM user, this will allow that user to perform all S3 actions on the bucket an-s3-bucket and all the files it contains. Without the /*, only operations against the bucket itself would be allowed. To set read-only permissions, use only the List and Get actions:

  "Action": [
    "s3:Get*",
    "s3:List*"
  ]

Since this is JSON data, we can easily parse and manipulate it through the API. I'll cover that shortly.

See the IAM policy documentation for more information.
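As a concrete illustration (not part of the original tooling), a policy document can be built and tweaked as ordinary Ruby data before serializing it back to JSON; the bucket name here is just an example:

```ruby
require 'json'

# Full-access policy for one bucket, as a plain Ruby hash.
policy = {
  'Statement' => [
    {
      'Action'   => 's3:*',
      'Effect'   => 'Allow',
      'Resource' => [
        'arn:aws:s3:::an-s3-bucket',
        'arn:aws:s3:::an-s3-bucket/*'
      ]
    }
  ]
}

# Derive a read-only variant by swapping the wildcard action for
# List and Get actions only (deep copy, so the original is untouched).
read_only = Marshal.load(Marshal.dump(policy))
read_only['Statement'][0]['Action'] = ['s3:Get*', 's3:List*']

puts JSON.pretty_generate(read_only)
```

The JSON string this prints is what would be handed to the IAM API as the group's policy.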

Chef Vault

We use data bags to store secret credentials we want to configure through Chef recipes. In order to protect these secrets further, we encrypt the data bags, using chef-vault. As I have previously written about chef-vault in general, this section will describe what we're interested in from our automation perspective.

Chef vault itself is concerned with three things:
  1. The content to encrypt.
  2. The nodes that should have access (via a search query).
  3. The administrators (human users) who should have access.
"Access" means that those entities are allowed to decrypt the encrypted content. In the case of our IAM users, the content to encrypt is the AWS access key ID and the AWS secret access key. The nodes will come from a search query to the Chef Server, stored as a field in the JSON document described in a later section. Finally, the administrators will simply be the list of users from the Chef Server.

Data File Format

The script reads a JSON file, described here:

  {
    "accounts": [
      "an-aws-account-name"
    ],
    "user": "secret-files",
    "group": "secret-files",
    "policy": {
      "Statement": [
        {
          "Action": "s3:*",
          "Effect": "Allow",
          "Resource": [
            "arn:aws:s3:::secret-files",
            "arn:aws:s3:::secret-files/*"
          ]
        }
      ]
    },
    "search_query": "role:secret-files-server"
  }

This is an example of the JSON we use. The fields:
  • accounts: an array of AWS account names that have authentication credentials configured in ~/.aws/config - see my post about managing multiple AWS accounts
  • user: the IAM user to create.
  • group: the IAM group for the created user. We use a 1:1 user:group mapping.
  • policy: the IAM policy of permissions, with the action, the effect, and the AWS resources. See the IAM documentation for more information about this.
  • search_query: the Chef search query to perform to get the nodes that should have access to the resources. For example, this one will allow all nodes that have the Chef role secret-files-server in their expanded run list.
These JSON files can go anywhere, the script will take the file path as an argument.

Create IAM Script

Note: This script is cleaned up to save space and get to the meat of it. I'm planning to make it into a knife plugin but haven't gotten a round tuit yet.

require 'inifile'
require 'aws-sdk'
require 'json'

filename = ARGV[0]
dirname  = File.dirname(filename)
aws_data = JSON.parse(

aws_data['accounts'].each do |account|
  aws_creds = {}
  aws_access_keys = {}
  # load the aws config for the specified account
  IniFile.load("#{ENV['HOME']}/.aws/config")[account].each{|k,v| aws_creds[k.gsub(/aws_/,'').to_sym]=v}
  iam =
  # Create the group
  group = iam.groups.create(aws_data['group'])
  # Load policy from the JSON file
  policy = AWS::IAM::Policy.from_json(aws_data['policy'].to_json)
  group.policies[aws_data['group']] = policy
  # Create the user
  user = iam.users.create(aws_data['user'])
  # Add the user to the group
  user.groups.add(group)
  # Create the access keys
  access_keys = user.access_keys.create
  aws_access_keys['aws_access_key_id'] = access_keys.credentials.fetch(:access_key_id)
  aws_access_keys['aws_secret_access_key'] = access_keys.credentials.fetch(:secret_access_key)
  # Create the JSON content to encrypt w/ Chef Vault
  vault_file ="#{File.dirname(__FILE__)}/../data_bags/vault/#{account}_#{aws_data['user']}_unencrypted.json", 'w')
  vault_file.puts JSON.pretty_generate(
      'id' => "#{account}_#{aws_data['user']}",
      'data' => aws_access_keys,
      'search_query' => aws_data['search_query']
  )
  vault_file.close
  # This would be loaded directly with Chef Vault if this were a knife plugin...
  puts <<-eoh
Paste this to encrypt the credentials (--admins should be a list of humans who should be able to decrypt):
knife encrypt create vault #{account}_#{aws_data['user']} \\
  --search '#{aws_data['search_query']}' \\
  --mode client \\
  --json #{dirname}/../data_bags/vault/#{account}_#{aws_data['user']}_unencrypted.json \\
  --admins "`knife user list | paste -sd ',' -`"
  eoh
end
This is invoked with:
% ./create-iam.rb ./iam-json-data/filename.json
The script iterates over each of the AWS accounts named in the accounts field of the named JSON file, and loads that account's credentials from the ~/.aws/config file. Then, it uses the aws-sdk Ruby library to authenticate a connection to the AWS IAM API endpoint. This instance object, iam, then uses methods to work with the API to create the group, user, policy, etc. The policy comes from the JSON document as described above. The script creates user access keys and writes them, along with some other metadata for Chef Vault, to a new JSON file that will be loaded and encrypted with the knife encrypt plugin.

As described, it will display a command to copy/paste. This is technical debt, as it was easier than directly working with the Chef Vault API at the time :).

Using Knife Encrypt

After running the script, we have an unencrypted JSON file in the Chef repository's data_bags/vault directory, named for the account and user created, e.g., data_bags/vault/an-aws-account-name_secret-files_unencrypted.json:

  {
    "id": "an-aws-account-name_secret-files",
    "data": {
      "aws_access_key_id": "the access key generated through the AWS API",
      "aws_secret_access_key": "the secret key generated through the AWS API"
    },
    "search_query": "roles:secret-files-server"
  }
The knife encrypt command comes from the plugin that Chef Vault provides; the create-iam.rb script's output shows exactly how to use it:
% knife encrypt create vault an-aws-account-name_secret-files \
  --search 'roles:secret-files-server' \
  --mode client \
  --json data_bags/vault/an-aws-account-name_secret-files_unencrypted.json \
  --admins "`knife user list | paste -sd ',' -`"


After running the create-iam.rb script with the example data file and then the knife encrypt command it outputs, we'll have the following:
  1. An IAM group in the AWS account named secret-files.
  2. An IAM user named secret-files, added to the secret-files group.
  3. Permission for the secret-files user to perform any S3 operations on the secret-files bucket (and the files it contains).
  4. A Chef Data Bag Item named an-aws-account-name_secret-files in the vault Bag, which will have encrypted contents.
  5. All nodes matching the search roles:secret-files-server will be present as clients in the item an-aws-account-name_secret-files_keys (in the vault bag).
  6. All users who exist on the Chef Server will be admins in the an-aws-account-name_secret-files_keys item.
To view AWS access key data, use the knife decrypt command.
% knife decrypt vault secret-files data --mode client
    data: {"aws_access_key_id"=>"the key", "aws_secret_access_key"=>"the secret key"}
The way knife decrypt works is you give it the field of encrypted data to decrypt, which is why the unencrypted JSON had a field named data - so we can use it to access any of the encrypted data we want. Similarly, we could use search_query instead of data to get the search query used, in case we wanted to update the access list of nodes.

In a recipe, we use the chef-vault cookbook's chef_vault_item helper method to access the content:
require 'chef-vault'
aws = chef_vault_item('vault', 'an-aws-account-name_secret-files')['data']


I wrote this script to automate the creation of a few dozen IAM users across several AWS accounts. Unsurprisingly, it took longer to test the recipe code and access to AWS resources across the various Chef recipes than it took to write the script and run it.

Hopefully this is useful for those who are using AWS and Chef, and were wondering how to manage IAM users. Since this is "done" I may or may not get around to releasing a knife plugin.
