December 24, 2008

Day 24 - Message Brokers

You know that cron job you have that runs every 5 minutes, checks if there is work to do, and does work only if there is some? Stop that. Put the old, primitive ways of distributed processing behind you. Cease the old habit of having your pipeline members periodically check "Is there input?", whether that means asking MySQL for signal data, looking for an empty file as a signal, or whatever. There's a better way: use a message broker.

Let's take a small example. You have a machine database that holds useful data about your hardware, such as hardware type, MAC addresses, services, etc. You were smart and decided that your dhcp configs would be autogenerated from this database, so now your dhcp server has a cron job that runs every 5 minutes, regenerates the dhcp config, and restarts dhcp. Bonus points if you restart dhcp only when the config is actually different.
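
The "bonus points" check is worth having no matter what triggers the regeneration. Here is a minimal sketch in Ruby, assuming the generated config is available as a string; the helper name write_if_changed and the sample config line are made up for illustration:

```ruby
require "tmpdir"

# Write new_config to path only when it differs from the current file.
# Returns true if the file changed, which is the caller's cue to restart dhcp.
def write_if_changed(path, new_config)
  return false if File.exist?(path) && File.read(path) == new_config

  tmp_path = "#{path}.tmp"
  File.write(tmp_path, new_config)
  File.rename(tmp_path, path)  # same directory, so the swap is atomic on POSIX
  true
end

# Usage against a scratch directory (the config text is made up):
Dir.mktmpdir do |dir|
  path = File.join(dir, "dhcpd.conf")
  config = "host web1 { hardware ethernet 00:16:3e:00:00:01; }\n"
  puts write_if_changed(path, config)  # first run writes the file
  puts write_if_changed(path, config)  # identical content, nothing to do
end
```

A caller would restart dhcp only when the helper returns true.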

What you should have done is hook whatever interface changes (or permits changes to) your machine database into sending a message that tells your dhcp server to regenerate its config. How can we easily do that?

Message brokers act as a channel for processes to communicate with each other easily, and they facilitate reliable, cross-platform, cross-language, cross-network messaging. AMQP, one of the protocols described below, supports a variety of messaging models (see further reading). The messaging model we want here is the 'store and forward' kind, since we have only one writer (the machine database) and one reader (the dhcp config updater); if we had more readers on a single channel, we would want a 'publish and subscribe' (pubsub) model. Message brokers support multiple independent channels for your processes to communicate over, and you choose the name of each channel.
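
To make the difference between the two models concrete, here is a toy, in-memory sketch in Ruby. It is not how any real broker is implemented, and the class and method names (ToyBroker, enqueue, dequeue, subscribe, publish) are invented for this example:

```ruby
# Toy in-memory broker, for illustration only; a real broker adds
# networking, persistence, and acknowledgements.
class ToyBroker
  def initialize
    @queues = Hash.new { |hash, name| hash[name] = [] }  # name => stored messages
    @topics = Hash.new { |hash, name| hash[name] = [] }  # name => subscriber callbacks
  end

  # Store and forward: keep the message until one reader fetches it.
  def enqueue(name, body)
    @queues[name] << body
  end

  # Return the oldest stored message, or nil if the queue is empty.
  def dequeue(name)
    @queues[name].shift
  end

  # Publish and subscribe: register a callback for a topic.
  def subscribe(name, &callback)
    @topics[name] << callback
  end

  # Hand a copy to every current subscriber; nothing is stored.
  def publish(name, body)
    @topics[name].each { |callback| callback.call(body) }
  end
end

broker = ToyBroker.new

# Queue: the message waits for a reader and is handed out once.
broker.enqueue("dhcp", "UPDATED")
puts broker.dequeue("dhcp").inspect
puts broker.dequeue("dhcp").inspect

# Topic: every subscriber sees the same message.
broker.subscribe("dhcp") { |body| puts "dhcp-server-1 got #{body}" }
broker.subscribe("dhcp") { |body| puts "dhcp-server-2 got #{body}" }
broker.publish("dhcp", "UPDATED")
```

With a queue, a message sent while the reader is away is still waiting when it comes back. With a topic, in this toy version at least, only the subscribers present at publish time see it.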

What are our options? AMQP, the Advanced Message Queuing Protocol, is a fancy standard that is implemented by software called message brokers. Popular message brokers include ActiveMQ, RabbitMQ, and OpenAMQ. Besides AMQP, there are other messaging standards, such as JMS (a Java API) and STOMP (a text-based protocol). STOMP is simple and can work on just about any message broker. With STOMP on ActiveMQ, you can do both queue and topic message models.
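
To show how simple STOMP is on the wire: a frame is a command line, zero or more 'key:value' header lines, a blank line, the body, and a terminating NUL byte. The sketch below builds and parses such a frame in Ruby. It is a simplification (no content-length header, no header escaping), and the helper names stomp_frame and parse_stomp_frame are invented here; in practice a STOMP library does this for you.

```ruby
# Build a simplified STOMP frame as a string.
def stomp_frame(command, headers = {}, body = "")
  header_lines = headers.map { |key, value| "#{key}:#{value}\n" }.join
  "#{command}\n#{header_lines}\n#{body}\0"
end

# Split a simplified STOMP frame back into [command, headers, body].
def parse_stomp_frame(frame)
  head, body = frame.chomp("\0").split("\n\n", 2)
  command, *header_lines = head.split("\n")
  headers = header_lines.map { |line| line.split(":", 2) }.to_h
  [command, headers, body.to_s]
end

frame = stomp_frame("SEND", { "destination" => "/topic/dhcp" }, "UPDATED")
puts frame.inspect
puts parse_stomp_frame(frame).inspect
```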

I'm assuming you already have a message broker configured that supports STOMP. If you don't, try out ActiveMQ. There are other brokers that support STOMP. Alternatively, you can use StompConnect to add STOMP functionality to anything supporting JMS.

First, we'll want to write the code that sends a message. Since STOMP's message contents are just text, let's send the message 'UPDATED' to signal that the machines database has been modified. Here's an example in Ruby:

require "rubygems"
require "stomp"   # install with 'gem install stomp'

# Connect to the stomp server
client = Stomp::Client.open "stomp://mystompserver:5906"

# Send "UPDATED" to the destination '/topic/dhcp'
# (newer releases of the stomp gem name this method 'publish')
client.send("/topic/dhcp", "UPDATED")
client.close

Assuming your machines database (or the interface to it) can be told to run this script when a modification occurs, you are halfway to completing this project.
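
If the interface to your machines database is code you control, the hook can live right in the write path. The sketch below is purely hypothetical: the class NotifyingMachineStore, its in-memory storage, and the notifier argument are all invented for illustration. In real use the notifier would wrap the stomp send shown above; here it just records messages, so the example runs without a broker.

```ruby
# Hypothetical wrapper around a machines database; every name here is
# invented for illustration.
class NotifyingMachineStore
  def initialize(notifier)
    @machines = {}        # hostname => attributes (stand-in for the real database)
    @notifier = notifier  # anything that responds to call(message)
  end

  # Save the record and notify, but only if something actually changed.
  def update(hostname, attributes)
    return false if @machines[hostname] == attributes

    @machines[hostname] = attributes
    @notifier.call("UPDATED")
    true
  end
end

sent = []
store = NotifyingMachineStore.new(->(message) { sent << message })
store.update("web1", { "mac" => "00:16:3e:00:00:01" })
store.update("web1", { "mac" => "00:16:3e:00:00:01" })  # no change, no message
puts sent.inspect
```

Skipping the notification when nothing changed keeps the dhcp server from regenerating its config for no reason.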

The other part is the receiver. You need a script that will listen for notifications and regenerate the dhcp config as necessary.

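require "rubygems"
require "stomp"

# Loop forever so we reconnect after the connection drops
loop do
  begin
    client = Stomp::Client.open "stomp://mystompserver:5906"

    # :ack => :client means we must acknowledge each message ourselves
    client.subscribe("/topic/dhcp", :ack => :client) do |msg|
      if msg.body == "UPDATED"
        system("/usr/local/bin/gendhcpconf")
      end
      client.acknowledge(msg)
    end

    # Block until the stomp client disconnects or dies
    client.join
    client.close
  rescue StandardError => e
    # A failed connection attempt should not kill the receiver
    warn "stomp connection problem: #{e.message}"
  end
  sleep(10)
end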
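
The receiver code is a bit longer. We subscribe to the same destination the sender writes to and, after handling each message, send the server an acknowledgement that we received it. The 'client.join' at the end uses a thread join to wait until the stomp client disconnects or dies. We wrap the whole thing in an infinite loop so that we reconnect if the stomp server dies. Note that on ActiveMQ a '/topic/' destination gives pubsub delivery; for the store-and-forward behavior described earlier, use a '/queue/' destination such as '/queue/dhcp' in both scripts.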

With this configuration, any change made to your machines database will cause a dhcp config regeneration. The benefits here are twofold: first, you no longer wake up every 5 minutes trying to do work, and second, any changes are propagated to your dhcp server immediately (for some small definition of immediately).

Message brokers are extremely useful in helping automate your production systems, especially across platforms and languages. There are STOMP, AMQP, and JMS libraries for just about every language you might use. Lastly, with the availability of free and open source messaging tools and libraries, the cost of deploying a broker is pretty low while the gains in automation and reliability can be high.

Further reading:

2 comments:

alexis said...

Actually only RabbitMQ currently supports both AMQP and STOMP. Here is an introduction to RabbitMQ and AMQP -- http://google-ukdev.blogspot.com/2008/09/rabbitmq-tech-talk-at-google-london.html

Cheers, alexis

Jordan Sissel said...

This doc shows how to enable stomp in ActiveMQ:
http://activemq.apache.org/stomp.html

Still, even if you don't have a broker supporting stomp, running StompConnect is trivial.