December 24, 2012

Day 24 - Twelve things you may not know about Chef

This was written by Joshua Timberman.

In this post, we will discuss a number of features that can be used in managing systems with Chef, but may be overlooked by some users. We'll also look at some features that are not so commonly used, and may prove helpful.

Here's a table of contents:

  1. Resources are first class citizens
  2. In-place file editing
  3. File Checksum comparisons
  4. Version matching
  5. Encrypting Data for Chef's Use
  6. Chef has a REPL
  7. Working with the Resource Collection
  8. Extending the Recipe DSL with helpers
  9. Load and execute a single recipe
  10. Integrating Chef with Your Tools
  11. Sending information to various places
  12. Tagging nodes

(1) Resources are first class citizens

This is probably something most readers who are familiar with Chef already know. However, we do encounter some uses of Chef that suggest the author didn't. For example, this is from an actual recipe I have seen:

execute "yum install foo" do
  not_if "rpm -qa | grep '^foo'"
end

execute "/etc/init.d/food start" do
  not_if "ps awux | grep /usr/sbin/food"
end

This totally works, assuming that the grep doesn't turn up a false positive (someone reading the 'food' man page?). However, there are resources for this kind of pattern, so it's best to use them instead:

package "foo" do
  action :install
end

service "food" do
  action :start
end

Core Chef Resources

Chef comes with a great many resources. These manage common components of operating systems, but they are also primitives that can be used on their own or composed into new resources.

Some common resources:

These actually make up probably 80% or more of the resources people will use. However, Chef comes with a few other resources that are less commonly used but still highly useful.

The scm resource has two providers, git and subversion, which can be used as the resource type. These are useful if a source repository must be checked out. For example, myproject is in subversion, and your project is in git.

subversion "myproject" do
  repository "svn://"
  destination "/opt/share/myproject"
  revision "HEAD"
  action :checkout
end

git "yourproject" do
  repository "git://"
  destination "/usr/local/src/yourproject"
  reference "1.2.3" # some tag
  action :checkout
end

This is used under the covers in the deploy resource.

The ohai resource can be used to reload attributes on the node that come from Ohai plugins.

For example, we can create a user, and then tell ohai to reload the plugin that has all user and group information.

ohai "reload_passwd" do
  action :nothing
  plugin "passwd"
end

user "daemonuser" do
  home "/dev/null"
  shell "/sbin/nologin"
  system true
  notifies :reload, "ohai[reload_passwd]", :immediately
end

Or, we can drop off a new plugin as a template, and then load that plugin.

ohai "reload_nginx" do
  action :nothing
  plugin "nginx"
end

template "#{node['ohai']['plugin_path']}/nginx.rb" do
  source "plugins/nginx.rb.erb"
  owner "root"
  group "root"
  mode 00755
  notifies :reload, 'ohai[reload_nginx]', :immediately
end

If your recipe(s) manipulate system state that future resources need to be aware of, this can be quite helpful.

The http_request resource makes... an HTTP request. This can be used to send (or receive) data via an API.

For example, we can send a request to retrieve some information:

http_request "some_message" do
  url ""
end

But more usefully, we can send a POST request. For example, on a Chef Server with CouchDB (Chef 10 and earlier), we can compact the database:

http_request "compact chef couchDB" do
  url "http://localhost:5984/chef/_compact"
  action :post
end

If you're building a custom lightweight resource/provider for an API service like a monitoring system, this could be a helpful primitive to build upon.

Opscode Cookbooks

Aside from the resources built into Chef, Opscode publishes a number of cookbooks that contain custom resources, or "LWRPs". See the README for these cookbooks for examples.

There are many more, and documentation for them is on the Opscode Chef docs site.

(2) In-place file editing

For a number of reasons, people may need to manage the content of files by replacing or adding specific lines. The common use case is something like sysctl.conf, which may have different tuning requirements from different applications on a single server.

This is an anti-pattern

Many folks who practice configuration management see this as an anti-pattern, and recommend managing the whole file instead. While that is ideal, it may not make sense for everyone's environment.

But if you really must...

The Chef source has a handy utility library to provide this functionality, Chef::Util::FileEdit. It provides a number of methods that can be used to manipulate file contents. These are used inside a ruby_block resource so that the Ruby code runs during the "execution phase" of the Chef run.

ruby_block "edit etc hosts" do
  block do
    rc = Chef::Util::FileEdit.new("/etc/hosts")
    rc.search_file_replace_line(
      /^127\.0\.0\.1 localhost$/,
      "127.0.0.1 #{new_fqdn} #{new_hostname} localhost"
    )
    rc.write_file
  end
end

For another example, Sean OMeara has written a line cookbook that includes a resource/provider to append a line to a file if it doesn't exist.
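To make the "append a line only if it doesn't exist" idea concrete, here is a minimal sketch in plain Ruby, outside of Chef. The helper name append_if_missing is made up for this illustration and is not part of Chef or of any cookbook; it only shows why such an edit is safe to repeat on every run.

```ruby
require 'tempfile'

# Hypothetical helper (not a Chef API): append `line` to the file at `path`
# unless an identical line is already present. Returns true if the file changed.
def append_if_missing(path, line)
  content = File.exist?(path) ? File.read(path) : ""
  return false if content.each_line.map(&:chomp).include?(line)

  # Make sure the new line starts on its own line.
  separator = (content.empty? || content.end_with?("\n")) ? "" : "\n"
  File.open(path, "a") { |f| f.write("#{separator}#{line}\n") }
  true
end

# Usage: the second call is a no-op, so repeated runs converge on the same file.
Tempfile.create("sysctl") do |f|
  f.write("vm.swappiness = 10\n")
  f.flush
  append_if_missing(f.path, "net.ipv4.ip_forward = 1")
  append_if_missing(f.path, "net.ipv4.ip_forward = 1")
  puts File.read(f.path)
end
```

Inside a recipe, logic like this would belong in a ruby_block or a custom resource so that it runs in the execution phase.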

(3) File Checksum comparisons

In managing file content with the file, template, cookbook_file, and remote_file resources, Chef compares the content using a SHA256 checksum. The class that computes this checksum can be used in your own Ruby programs or libraries too. Sure, you can use the "sha256sum" command, but this is native Ruby instead of shelling out.

The class to use is Chef::ChecksumCache and the method is #checksum_for_file.

require 'chef/checksum_cache'
sha256 = Chef::ChecksumCache.checksum_for_file("/path/to/file")
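If the Chef libraries aren't loaded, Ruby's standard library can compute the same kind of digest. This is a small sketch using Digest::SHA256 rather than Chef's own class; the method name sha256_for_file and the temporary file are made up for the example.

```ruby
require 'digest/sha2'
require 'tempfile'

# Hypothetical helper: hex-encoded SHA-256 of a file's contents,
# computed in pure Ruby without shelling out to sha256sum.
def sha256_for_file(path)
  Digest::SHA256.file(path).hexdigest
end

# Usage: two files with the same content have the same checksum,
# which is the kind of comparison that tells you a file is unchanged.
Tempfile.create("a") do |a|
  Tempfile.create("b") do |b|
    [a, b].each { |f| f.write("same content\n"); f.flush }
    puts sha256_for_file(a.path) == sha256_for_file(b.path)
  end
end
```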

(4) Version matching

It is quite common to need version string comparison checks in recipes. Perhaps we want to match the version of the platform this node is running on. Often we can simply use a numeric comparison between floating point numbers or strings:

if node['platform_version'].to_f == 10.04
if node['platform_version'] == "6.3"

However, sometimes we have versions with three parts, and matching on the third part is relevant. It would get lost in #to_f, and greater-than or less-than comparisons on strings can give the wrong answer.
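A short sketch in plain Ruby shows both pitfalls, along with what a correct comparison has to do: compare each dotted part as an integer. The compare_versions helper is hypothetical and only for illustration.

```ruby
# Hypothetical helper: compare dotted version strings part by part as integers.
# Returns -1, 0, or 1, like <=>. Missing parts are treated as 0.
def compare_versions(a, b)
  left  = a.split(".").map(&:to_i)
  right = b.split(".").map(&:to_i)
  size  = [left.size, right.size].max
  left  += [0] * (size - left.size)
  right += [0] * (size - right.size)
  left <=> right
end

puts "10.7.3".to_f == "10.7.5".to_f        # true: the third part is lost
puts "10.10" < "10.9"                      # true: strings compare character by character
puts compare_versions("10.10", "10.9")     # 1: 10.10 is the newer version
puts compare_versions("10.7.3", "10.7.5")  # -1
```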


The Chef::VersionConstraint class can be used for version comparisons. It is modeled after the version constraints in Chef cookbooks themselves.

First we initialize the Chef::VersionConstraint with an argument containing the comparison operator and the version as a string. Then, we send the #include? method with the version to compare as an argument. For example, we might be checking that the version of OS X is 10.7 or higher (Lion).

require 'chef/version_constraint'

Chef::VersionConstraint.new(">= 10.7.0").include?("10.6.0") #=> false
Chef::VersionConstraint.new(">= 10.7.0").include?("10.7.3") #=> true
Chef::VersionConstraint.new(">= 10.7.0").include?("10.8.2") #=> true

Or, in a Chef recipe we can use the node's platform version attribute. For example, on a CentOS 5.8 system:

Chef::VersionConstraint.new("~> 6.0").include?(node['platform_version']) # false

But on a CentOS 6.3 system:

Chef::VersionConstraint.new("~> 6.0").include?(node['platform_version']) # true

Chef's version number is stored as a node attribute (node['chef_packages']['chef']['version']) that can be used in recipes. Perhaps we want to check for a particular version because we're going to use a feature in the recipe only available in newer versions.

version_checker = Chef::VersionConstraint.new(">= 0.10.10")
mac_service_supported = version_checker.include?(node['chef_packages']['chef']['version'])

if mac_service_supported
  # mac service is supported, so do these things
end
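For scripts that run without Chef loaded, RubyGems ships similar constraint classes. This is only a sketch of the same idea using Gem::Requirement and Gem::Version; it is not Chef's implementation, the two may differ in edge cases, and the version_satisfies? helper is made up here.

```ruby
require 'rubygems'

# Hypothetical helper: does `version` satisfy `constraint`,
# using RubyGems' classes instead of Chef::VersionConstraint?
def version_satisfies?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

puts version_satisfies?(">= 10.7.0", "10.6.0") # false
puts version_satisfies?(">= 10.7.0", "10.7.3") # true
puts version_satisfies?("~> 6.0", "5.8")       # false
puts version_satisfies?("~> 6.0", "6.3")       # true
```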

(5) Encrypting Data for Chef's Use

By default, the data stored on the Chef Server is not encrypted. Node attributes, while containing useful data, are plaintext for anyone that has a private key authorized to the Chef Server. However, sometimes it is desirable to store encrypted data, and Data Bags (stores of arbitrary JSON data) can be encrypted.

You'll need a secret key. This can be a phrase or a file. The key needs to be available on any system that will need to decrypt the data. A cryptographically strong secret key is best, and can be generated with OpenSSL:

openssl rand -base64 512 > ~/.chef/encrypted_data_bag_secret

Next, create the data bag that will contain encrypted items. For example, I'll use secrets.

knife data bag create secrets

Next, create the items in the bag that will be encrypted.

knife data bag create secrets credentials --secret-file ~/.chef/encrypted_data_bag_secret
{
  "id": "credentials",
  "user": "joshua",
  "password": "dirty_secrets"
}

Then, view the content of the data bag item:

knife data bag show secrets credentials
id:        credentials
password:  cKZgOISOE+lmRiqf9j5LlRegtcILqvVw6XRft11T7Pg=
user:      mBf1UDwAGq0N0Ohqugabfg==

Naturally, this is encrypted using the secret file. Decrypt it:

knife data bag show secrets credentials --secret-file ~/.chef/encrypted_data_bag_secret
id:        credentials
password:  dirty_secrets
user:      joshua

To use this data in a recipe, the secret file must be copied and its location configured in Chef. The knife bootstrap command can do this automatically if your knife.rb contains the encrypted_data_bag_secret configuration. Presuming that the .chef directory contains the knife.rb and the above secret file:

encrypted_data_bag_secret "./encrypted_data_bag_secret"

In a Recipe, Chef::EncryptedDataBagItem

Nodes bootstrapped using the default bootstrap template will have the secret key file copied to /etc/chef/encrypted_data_bag_secret, and available for Chef. This is a constant in the Chef::EncryptedDataBagItem class, DEFAULT_SECRET_FILE. To use this in a recipe, use the #load_secret method, then pass that as an argument to the #load method for the data bag item. Finally, access various keys from the item like a Ruby Hash. Example below:

secret = Chef::EncryptedDataBagItem.load_secret(Chef::EncryptedDataBagItem::DEFAULT_SECRET_FILE)
user_creds = Chef::EncryptedDataBagItem.load("secrets","credentials", secret)
user_creds['id'] # => "credentials"
user_creds['user'] # => "joshua"
user_creds['password'] # => "dirty_secrets"
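To get a feel for what a shared secret buys you, here is a rough sketch of encrypting and decrypting a single value with Ruby's OpenSSL library. It is not Chef's implementation: the cipher (AES-256-CBC), the key derivation, the storage format, and the helper names are all assumptions made for this illustration, and Chef's actual behavior may differ by version.

```ruby
require 'openssl'
require 'json'

SKETCH_CIPHER = 'aes-256-cbc' # assumption for this sketch only

# Hypothetical helper: derive a 32-byte key from the shared secret.
def sketch_key(secret)
  OpenSSL::Digest::SHA256.digest(secret)
end

# Hypothetical helper: encrypt one value, returning Base64 text that is safe to store as JSON.
def encrypt_value(value, secret)
  cipher = OpenSSL::Cipher.new(SKETCH_CIPHER)
  cipher.encrypt
  cipher.key = sketch_key(secret)
  iv = cipher.random_iv
  # Wrap the value in a hash so any JSON type survives the round trip.
  wrapped = JSON.generate({ "wrapped" => value })
  encrypted = cipher.update(wrapped) + cipher.final
  { "iv" => [iv].pack("m0"), "data" => [encrypted].pack("m0") }
end

# Hypothetical helper: reverse encrypt_value using the same secret.
def decrypt_value(blob, secret)
  cipher = OpenSSL::Cipher.new(SKETCH_CIPHER)
  cipher.decrypt
  cipher.key = sketch_key(secret)
  cipher.iv = blob["iv"].unpack1("m0")
  plain = cipher.update(blob["data"].unpack1("m0")) + cipher.final
  JSON.parse(plain)["wrapped"]
end

blob = encrypt_value("dirty_secrets", "my shared secret")
puts blob["data"]                                # unreadable without the secret
puts decrypt_value(blob, "my shared secret")     # the original value
```

The takeaway is the same as with the data bag: anyone can see that the item exists, but only systems holding the secret can read the values.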

(6) Chef has a REPL

Chef comes with a built-in "REPL" or shell, called shef. A REPL is "Read, Eval, Print, Loop" or "read what I typed in, evaluate it, print out the results, and do it again." Other examples of REPLs are Python's python with no arguments, a Unix shell, or Ruby's irb.

shef (chef-shell in Chef 11)

In Chef 10 and earlier, the Chef REPL is invoked as a binary named shef. In Chef 11 and later, it is renamed to chef-shell. Additional options can be passed on the command line, including a config file to use, or an overall mode to use (solo or client/server). See shef --help for options.

Once invoked, shef has multiple run-time contexts that can be used:

  • main
  • recipe (recipe_mode in Chef 11)
  • attributes (attributes_mode in Chef 11)

At any time, you can type "help" to get context specific help. The "main" context provides a number of API helper methods. The "attributes" context functions as a cookbook's attributes file. The "recipe" context is in the Chef recipe DSL context, where resources can be created and run. For example:

chef:recipe > package "zsh" do
chef:recipe >   action :install
chef:recipe ?> end
 => <package[zsh] @name: "zsh" @package_name: "zsh" @resource_name: :package >

(the output is trimmed for brevity, try it on your own system)

This works similarly to how Chef processes recipes. It has recognized the input as a Chef Resource and added it to the resource collection. This doesn't actually manage the resource until we enter the execution phase, similar to a Chef run. We can do that with the shef method run_chef:

chef:recipe > run_chef
[2012-12-23T12:32:27-07:00] INFO: Processing package[zsh] action install ((irb#1) line 1)
[2012-12-23T12:32:27-07:00] DEBUG: package[zsh] checking package status for zsh
  Installed: 4.3.17-1ubuntu1
  Candidate: 4.3.17-1ubuntu1
  Version table:
 *** 4.3.17-1ubuntu1 0
        500 precise/main amd64 Packages
        100 /var/lib/dpkg/status
[2012-12-23T12:32:27-07:00] DEBUG: package[zsh] current version is 4.3.17-1ubuntu1
[2012-12-23T12:32:27-07:00] DEBUG: package[zsh] candidate version is 4.3.17-1ubuntu1
[2012-12-23T12:32:27-07:00] DEBUG: package[zsh] is already installed - nothing to do
 => true
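The split between declaring a resource and acting on it can be modeled in a few lines of plain Ruby. The classes below are toys with made-up names, not Chef's classes; they only illustrate that declaring a resource adds it to a collection, and nothing happens until the run walks that collection.

```ruby
# Hypothetical toy resource: just a type, a name, and an action.
class ToyResource
  attr_reader :type, :name, :action

  def initialize(type, name, action)
    @type, @name, @action = type, name, action
  end

  def to_s
    "#{type}[#{name}]"
  end
end

# Hypothetical toy run: a collection that is filled first and executed later.
class ToyRun
  attr_reader :collection, :log

  def initialize
    @collection = []
    @log = []
  end

  # "Compile" phase: declaring a package only records it.
  def package(name, action = :install)
    @collection << ToyResource.new(:package, name, action)
  end

  # "Execution" phase: walk the collection in order and act on each resource.
  def run
    @collection.each { |r| @log << "Processing #{r} action #{r.action}" }
    true
  end
end

toy = ToyRun.new
toy.package("zsh")
puts toy.log.empty?  # nothing has been processed yet
toy.run
puts toy.log
```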

There are many possibilities for debugging and exploring with this tool. For example, use it to test the examples that are presented in this post.

chef/shef/ext (renamed in Chef 11)

The methods available in the "main" context of Shef are also available to your own scripts and plugins by requiring Chef::Shef::Ext. In Chef 11, this will be Chef::Shell::Ext, though the old one is present for compatibility.

require 'chef/shef/ext'
nodes.all # => [node[doppelbock], node[cask], node[ipa]]

(7) Working with the Resource Collection

One of the features of Chef is that Recipes are pure Ruby. As such, we can manipulate things that are in the Object Space, such as other Chef objects. One of these is the Resource Collection, the data structure that contains all the resources that have been seen as Chef processes recipes. Using shef, or any Chef recipe, we can work with the resource collection for a variety of reasons.

Look Up Another Resource

The #resources method will return an array of all the resources. From our shef session earlier, we have a single resource:

chef:recipe > resources
["package[zsh]"]

We can add others.

chef:recipe > service "food"
chef:recipe > file "/tmp/food-zsh-completion"

Now when we look at the resource collection, we'll see the new resources:

chef:recipe > resources
["package[zsh]", "service[food]", "file[/tmp/food-zsh-completion]"]

We can use the resources method to open a specific resource.

"Re-Open" Resources to Modify/Override

If we look at the service[food] resource that was created (using all default parameters), we'll see:

chef:recipe > resources("service[food]")
<service[food] @name: "food" @noop: nil @before: nil @params: {} @provider: nil @allowed_actions: [:nothing, :enable, :disable, :start, :stop, :restart, :reload] @action: "nothing" @updated: false @updated_by_last_action: false @supports: {:restart=>false, :reload=>false, :status=>false} @ignore_failure: false @retries: 0 @retry_delay: 2 @source_line: "(irb#1):2:in `irb_binding'" @elapsed_time: 0 @resource_name: :service @service_name: "food" @enabled: nil @running: nil @parameters: nil @pattern: "food" @start_command: nil @stop_command: nil @status_command: nil @restart_command: nil @reload_command: nil @priority: nil @startup_type: :automatic @cookbook_name: nil @recipe_name: nil>

To work with this, it is easier to assign to a local variable.

chef:recipe > f = resources("service[food]")

Then, we can call the various parameters as accessor methods.

chef:recipe > f.supports
 => {:restart=>false, :reload=>false, :status=>false}

We can modify this by sending the supports method to f with additional arguments. For example, maybe the food service supports restart and status commands, but not reload:

chef:recipe > f.supports({:restart => true, :status => true})
 => {:restart=>true, :status=>true}

As a more practical example, perhaps you want to use a cookbook from the Chef Community Site that manages a couple services on Ubuntu. However, the author of the cookbook hasn't updated the cookbook in a while, and those services are managed by upstart instead of being init.d scripts. You could create a custom cookbook that "wraps" the upstream cookbook with a recipe like this to modify those service resources:

if platform?("ubuntu")
  ["service_one", "service_two"].each do |s|
    srv = resources("service[#{s}]")
    srv.provider Chef::Provider::Service::Upstart
    srv.start_command "/usr/bin/service #{s} start"
  end
end

Then in the node's run list, you'd have the upstream cookbook's recipe and your custom recipe:

  "run_list": [

This is a pattern that has become popular with the idea of "Library" vs. "Application" cookbooks, and Bryan Berry has a RubyGem to provide a helper for it.

(8) Extending the Recipe DSL with helpers

One of the features of a Chef cookbook is that it can contain a "libraries" directory with files containing helper libraries. These can be new Chef Resources/Providers, ways of interacting with third party services, or simply extending the Chef Recipe DSL.

Let's just have a simple method that shortcuts the Chef version attribute so we don't have to type the whole thing in our recipes.

First, create a cookbook named "my_helpers".

knife cookbook create my_helpers

Then create the library file. The filename can be anything you want; all library files are loaded by Chef.

touch cookbooks/my_helpers/libraries/default.rb

Then, since we are extending the Chef Recipe DSL, add this method to its class, Chef::Recipe.

class Chef
  class Recipe
    def chef_version
      node['chef_packages']['chef']['version']
    end
  end
end

To use this in a recipe, simply call that method. From the earlier example:

mac_service_supported = version_checker.include?(chef_version)
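This works because Ruby classes are open: a library file can reopen a class and add methods to it, and every recipe picks them up. Here is a self-contained sketch using a stand-in class (Toy::Recipe is made up and is not Chef::Recipe) so it runs without Chef.

```ruby
# Hypothetical stand-in for Chef::Recipe, holding node data in a plain Hash.
module Toy
  class Recipe
    attr_reader :node

    def initialize(node)
      @node = node
    end
  end
end

recipe = Toy::Recipe.new({ "chef_packages" => { "chef" => { "version" => "10.16.2" } } })

# What a library file does: reopen the class and add a helper method.
module Toy
  class Recipe
    def chef_version
      node["chef_packages"]["chef"]["version"]
    end
  end
end

# Even the object created before the class was reopened gains the helper.
puts recipe.chef_version
```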

Next, I'll use a helper library for the Encrypted Data Bag example from earlier to demonstrate this. I created a separate library file.

touch cookbooks/my_helpers/libraries/encrypted_data_bag_item.rb

It contains:

class Chef
  class Recipe
    def encrypted_data_bag_item(bag, item, secret_file = Chef::EncryptedDataBagItem::DEFAULT_SECRET_FILE)
      secret = EncryptedDataBagItem.load_secret(secret_file)
      EncryptedDataBagItem.load(bag, item, secret)
    rescue Exception
      Log.error("Failed to load data bag item: #{bag.inspect} #{item.inspect}")
      raise # re-raise so the failure is not silently swallowed
    end
  end
end

Now, when I want to use it in a recipe, I can:

user_creds = encrypted_data_bag_item("secrets", "credentials")

(9) Load and execute a single recipe

In default operation, Chef loads cookbooks and recipes from their directories on disk. It is actually possible to load a single recipe file by composing a new binary program from Chef's built-in classes. This is helpful for simple use cases or as a general example. Dan DeLeo of Opscode wrote this as a gist a while back, which I've updated here:

It's only 45 lines counting whitespace. Simply save that to a file, and then create a recipe file, and run it with the filename as an argument.

root@virt1test:~# wget
--2012-12-23 13:56:32--
2012-12-23 13:56:32 (137 MB/s) - `chef-apply.rb' saved [848]

root@virt1test:~# chmod +x chef-apply.rb
root@virt1test:~# ./chef-apply.rb recipe.rb
[2012-12-23T13:56:54-07:00] INFO: Run List is []
[2012-12-23T13:56:54-07:00] INFO: Run List expands to []
[2012-12-23T13:56:54-07:00] INFO: Processing package[zsh] action install ((chef-apply cookbook)::(chef-apply recipe) line 1)
[2012-12-23T13:56:54-07:00] INFO: Processing package[vim] action install ((chef-apply cookbook)::(chef-apply recipe) line 2)
[2012-12-23T13:56:54-07:00] INFO: Processing file[/tmp/stuff] action create ((chef-apply cookbook)::(chef-apply recipe) line 3)

This is the simple recipe:

package "zsh"
package "vim"

file "/tmp/stuff" do
  content "I have some stuff I'm stashing in here."
end

This functionality is quite useful for example purposes, and a ticket (CHEF-3571) was created to track its addition for core Chef.

(10) Integrating Chef with Your Tools

There's a rising ecosystem of tools surrounding Chef. Many of them use the Chef REST API to expose cool functionality and let you build your own tooling on top.

spice and ridley (ruby)

spice and ridley provide Ruby APIs that talk to Chef.

pychef (python)

pychef gives you a nice API for hitting the Chef API from Python.

jclouds (java/clojure)

jclouds has a Chef component that lets you use the Chef REST API from Java and Clojure.

(11) Sending information to various places

Chef has the ability to send output to a variety of places. By default, it will output to standard out. This is managed through the Chef logger, a class called Chef::Log.

The Chef::Log Configuration

The Chef::Log logger has three main configuration options:

  • log_level: the amount of log output to display. Default is "info", but "debug" is common.
  • log_location: where the log output should go. Default is standard out.
  • verbose_logging: whether to display "Processing:" messages for each resource Chef processes. Default is true.

The first two are configurable with command-line options, or in the configuration file. The level is the -l (small ell) option, and the location is the -L (big ell) option.

chef-client -l debug -L debug-output.log

In the configuration file, the level should be specified as a symbol (preceding colon), and the location as a string or constant (if using standard out).

log_level :info
log_location STDOUT


log_level :debug
log_location "/var/log/chef/debug-output.log"

The verbose output option is in the configuration file. To suppress "Processing" lines, set it to false.

verbose_logging false

Output Formatters

A new feature for log output introduced in Chef 10.14 is "Output Formatters". These can be set with the -F option, or the formatter configuration option. There are some formatters included in Chef:

  • base: the default
  • doc: nicely presented "documentation" type output
  • min: rspec style minimal output

For example, to use the doc style but only for one run:

chef-client -F doc -l fatal

Use the log level fatal so normal logger messages aren't displayed. To make this permanent for all runs, put it in the config file.

log_level :fatal
formatter "doc"

You can create your own formatters, too. An example of this is Andrea Campi's nyan cat formatter. You can deploy this and use it with Sean OMeara's cookbook.

Report/Exception Handlers

Chef has an API for running report/exception handlers at the end of a Chef run. These can display information about the resources that were updated, any exception that occurred, or other data about the run itself. The handlers themselves are Ruby classes that inherit from Chef::Handler, and then override the report method to perform the actual reporting work. Chef handlers can be distributed as RubyGems, or single files.
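The shape of a handler is easy to see in a sketch. To keep this runnable without Chef installed, it uses stand-ins: ToyHandler and ToyRunStatus are made up for this example and only mimic the part of the API described above (a base class that stores the run status and calls report). In a real handler you would inherit from Chef::Handler instead.

```ruby
# Hypothetical stand-in for Chef::Handler so the sketch runs without the chef gem.
# It only mimics storing the run status and calling #report.
class ToyHandler
  attr_reader :run_status

  def run_report_safely(run_status)
    @run_status = run_status
    report
  end
end

# Hypothetical run status with only the fields this sketch reads.
ToyRunStatus = Struct.new(:updated_resources, :elapsed_time)

class UpdatedResourcesReport < ToyHandler
  # Build the report lines separately so they are easy to inspect.
  def summary
    count = run_status.updated_resources.size
    header = "#{count} resources updated in #{run_status.elapsed_time}s"
    [header] + run_status.updated_resources.map { |r| "  #{r}" }
  end

  # The method a handler overrides to do the actual reporting work.
  def report
    puts summary
  end
end

status = ToyRunStatus.new(["package[zsh]", "file[/tmp/stuff]"], 1.5)
UpdatedResourcesReport.new.run_report_safely(status)
```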


Chef becomes aware of the report or exception handlers through the configuration file. For example, if I wanted to use the updated_resources handler that I wrote as a RubyGem, I would install the gem on the system, and then put the following in my /etc/chef/client.rb.

require "chef/handler/updated_resources"
report_handlers <<
exception_handlers <<

Then at the end of the run, the report would print out the resources that were updated.

chef_handler Cookbook

For handlers that are simply a single file, use Opscode's chef_handler cookbook. It will automatically handle putting the handlers in place on the system, and adding them to the configuration.

Other Handlers

A number of Chef handlers are available from the community and many are listed on the Exception and Report Handlers page. Conventionally, authors often prepend chef-handler to their gem names to make them easier to find. Some common ones you may find useful:

(12) Tagging nodes

A feature that has existed in Chef since its initial release is "node tagging". This is simply a node attribute built in where entries can be added and removed, or queried easily.

Use cases

One can certainly use other node attributes for storing data. Since node attributes can be any JSON object type, arrays are easily available. However, "tags" have some special helpers available, and semantic uses that may make more sense than plain attributes.

Part of the idea is that tags may be added or removed, flipping the node to various states as far as the Chef Server is concerned. For example, one might only want to monitor nodes that have a certain tag, or run database migrations on a node tagged to do so.

Tags in Chef Recipes

In Chef recipes, we can search for nodes that have a particular tag. Perhaps nodes tagged "decommissioned" shouldn't be monitored.

decommissioned_nodes = search(:node, "tags:decommissioned")

The recipe DSL itself has some tag-specific helper methods, too.

Use tagged? to see if the node running Chef has a specific tag:

if tagged?("decommissioned")
  raise "Why am I running Chef if I'm decommissioned?"
end

Perhaps more usefully:

if tagged?("run_migrations")
  execute "rake db:migrate" do
    cwd "/srv/myapp/current"
  end
end

If the tags of the node need to be modified during a run, that can be done with the tag and untag methods.

tag("deployed")

log "I'm printed if the tag deployed is set." do
  only_if { tagged?("deployed") }
end

Or perhaps more usefully, untag the node after the migrations from earlier are run:

if tagged?("run_migrations")
  execute "rake db:migrate" do
    cwd "/srv/myapp/current"
    notifies :create, "ruby_block[untag-run-migrations]", :immediately
  end
end

ruby_block "untag-run-migrations" do
  block do
    untag("run_migrations")
  end
  only_if { tagged?("run_migrations") }
end
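Since tags are just entries in a list attached to the node, the helpers are easy to picture. This toy model in plain Ruby is an assumption-laden sketch: the ToyNode class is made up, and Chef's real helpers may behave differently in edge cases (for example, when called with no arguments).

```ruby
# Hypothetical model of a node whose tags live in a plain array.
class ToyNode
  attr_reader :tags

  def initialize
    @tags = []
  end

  # Add each tag once; adding an existing tag changes nothing.
  def tag(*names)
    names.each { |n| @tags << n unless @tags.include?(n) }
    @tags
  end

  # Remove each tag if present.
  def untag(*names)
    names.each { |n| @tags.delete(n) }
    @tags
  end

  # True only if every given tag is set.
  def tagged?(*names)
    names.all? { |n| @tags.include?(n) }
  end
end

node = ToyNode.new
node.tag("run_migrations")
puts node.tagged?("run_migrations")
node.untag("run_migrations")
puts node.tagged?("run_migrations")
```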

Knife Commands

There are knife commands for viewing and manipulating node tags.

View the tags of a node:

knife tag list

Add a tag to a node:

knife tag create powered_off
Created tags powered_off for node

Remove a tag from a node:

knife tag delete powered_off
Deleted tags powered_off for node


Hopefully this post contains a number of things you didn't know were available to Chef, and will be useful in your Chef environment.

