December 1, 2015

Day 1 - Using Automation to build an OpenStack Cloud

Written by: JJ Asghar, @jjasghar

Edited by: Klynton Jessup, @klyntonj

I wrote this as a narrative around what I hope a typical engineer would experience trying to resolve an issue. This story, while fiction, is taken from personal experiences and inspired by what I hope would happen. I hope you enjoy this.

Like normal, I came to my stand-up unexcited for the normal grind. My boss Steve came in, sat down at the conference table, and put his notebook down. Yes, stand-up happened in a conference room and yes we actually sat at the conference room table, so, honestly, I never really understood why we called it a stand-up. I guess it was a “rebranding” of “Daily Status” or maybe it was a holdover from those days we tried to do “agile.” Who knows?

Anyway, Steve opened his notebook and looked around at my team. “So, we have a problem. The local development for our cookbooks is great but we need to start testing on multiple platforms. There’s a chance we might be spinning up a new application and it only runs on CentOS.” (We’re an Ubuntu shop.) There were some sighs and groans as we looked around at each other.

“Today y’all have your normal responsibilities but we need to think of a way to parallelize our cookbook development. I dunno if you’ve ever tried it but running kitchen test -p 2 on your laptops brings everything to a grinding halt. Let’s skip stand-up and spend the first half of the day doing some research and let’s try and come up with some ideas. We’ll get back together after lunch.” With that Steve closed his notebook, got up, and walked out of the room.

“Interesting,” I said under my breath. “I guess it’s time to start Googling.”

I walked back over to my laptop and typed in “parallelize cookbook development” in Google. Wow, there was nothing about test-kitchen on there, until I got to http://leopard.in.ua/2013/12/01/chef-and-tdd/ on the second page! It’s 2015, so this post from 2013 has to be out of date. Right? But didn’t Steve mention the -p in stand-up? I read through the page and found out that there was a kitchen test --parallel that must be what he was talking about. Sweet. OK, he isn’t confused and just saying words again.

I hopped over to https://github.com/test-kitchen/ and noticed all the drivers for test-kitchen. There were a ton, ranging from AWS to Hyper-V to OpenStack and Digital Ocean (DO). This is great, I can run test-kitchen on any cloud I want. I started to play around with the different options and settled on https://github.com/test-kitchen/kitchen-digitalocean as my test version. I’ve always enjoyed using Digital Ocean to do my development and it seemed reasonable. I opened my .kitchen.yml and looked at the configuration.

driver:
  name: vagrant
  network:
    - ["forwarded_port", {guest: 80, host: 8080, auto_correct: true}]
  customize:
    cpus: 4
    memory: 8096

I made the changes to the driver name and added in the Droplet size I wanted:

driver:
  name: digitalocean
driver_config:
  size: 8gb
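
For anyone following along, a fuller .kitchen.yml for this setup might look like the sketch below. It is written as a shell heredoc so it can be pasted straight into a terminal. The driver name and size come from the snippet above; the platform slugs, the chef_zero provisioner, and the my_cookbook run_list are assumptions for illustration, and size is placed directly under driver here (also an assumption; the snippet above uses a separate driver_config key). Check the kitchen-digitalocean README for the image names your account actually offers.

```shell
#!/bin/sh
# Sketch only: write an example .kitchen.yml into a scratch directory.
# Platform slugs, provisioner, and cookbook name are assumptions,
# not taken from the article.
KITCHEN_DIR="${KITCHEN_DIR:-$(mktemp -d)}"

cat > "$KITCHEN_DIR/.kitchen.yml" <<'EOF'
---
driver:
  name: digitalocean
  size: 8gb

provisioner:
  name: chef_zero

platforms:
  - name: ubuntu-14-04-x64
  - name: centos-7-0-x64

suites:
  - name: default
    run_list:
      - recipe[my_cookbook::default]
EOF

echo "wrote $KITCHEN_DIR/.kitchen.yml"
```

With two platforms and one suite, test-kitchen would create two instances, which is what makes a parallel run worthwhile.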

And then installed the gem and exported my Digital Ocean credentials:

gem install kitchen-digitalocean
export DIGITALOCEAN_ACCESS_TOKEN="THIS_ISNT_MY_REAL_TOKEN"
export DIGITALOCEAN_SSH_KEY_IDS="8675, 30900"
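
Forgetting one of those exports is an easy mistake, so a small guard like the one below can fail fast before any Droplets are requested. This is a hypothetical helper, not part of test-kitchen or the driver; only the two variable names come from the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not part of test-kitchen): verify that the
# Digital Ocean variables exported above are non-empty before a run.
check_do_env() {
  missing=0
  for var in DIGITALOCEAN_ACCESS_TOKEN DIGITALOCEAN_SSH_KEY_IDS; do
    # Indirect lookup of the variable named in $var (POSIX sh has no ${!var}).
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

It could be chained in front of the parallel run, for example: check_do_env && kitchen test --parallel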

Then I ran test-kitchen. Nice! It worked. This is great. I spun up both of my test boxes on Digital Ocean, one Ubuntu 14.04 and one CentOS 7 at the same time and verified them both. I got up and walked over to Steve’s office. “Hey Steve, I think I got an answer for you from stand-up,” I said.

“Oh yeah? That was fast,” he said looking up from his laptop. “What is it?”

“I hooked up test-kitchen to Digital Ocean and ran a test command on both systems at the same time,” I said, with confidence.

“Ha! That’s awesome. I had no idea you could use other drivers with test-kitchen, I thought it was all local development.”

“Yeah I learned that after looking at the GitHub test-kitchen org, turns out it was pretty easy to set up.”

“Hang on, Digital Ocean costs money though right?”

“Yeah, it’s only pennies though, and I charged my own personal account.”

“Ah well if it’s ‘only pennies’ I guess you won’t need to expense it.”

“Ouch, fine. I guess I deserve that. So you think it might be too expensive to run our test suite for every commit?”

“You read my mind. We need something like DO but local. How about OpenStack?”

“HAHAHAAHAHAAHAHAHAHAHAH” I actually started crying from the tears of laughter. “Yeah right, I don’t have enough time to set up an OpenStack cloud. OpenStack has only gotten worse since they started the project.”

Steve took a moment to let me compose myself, then said plainly, “You should give it a shot, I heard JJ, the OpenStack Chef guy, mention something called an ‘OpenStack-model-t.’ It’s a project Chef was working on to help people build basic OpenStack clouds. Any chance there’s a kitchen-openstack driver for it like there was for Digital Ocean?”

“Actually, yeah there is. Ah, OK I see where you’re going with this, I’ll report back what I find out about these two projects.” I turned around and walked out of his office.

I sat down at my laptop, opened up Chrome, and typed in: openstack-model-t. The first hit was: https://github.com/chef-partners/openstack-model-t. One of the first things that caught my eye was the Henry Ford quote: “The Customer Can Have Any Color He Wants So Long As It’s Black”. Funny, real funny, Chef. I started looking around. It seemed that the cookbook was pretty straightforward. There was even a .kitchen.yml file in the repo, so I figured I’d check out the repo and give it a shot.

cd ~
mkdir openstack-stuff
cd openstack-stuff
git clone https://github.com/chef-partners/openstack-model-t.git
cd openstack-model-t
chef exec kitchen verify

My laptop fans started to spin and my MacBook Pro started to heat up. Yep, I was building an OpenStack cloud on my laptop. After about 20 minutes, I came back to my laptop and I saw:

         Process "nova-scheduler"
           should be running
           user
             should eq "nova"
         Process "neutron-l3-agent"
           should be running
           user
             should eq "neutron"
         Process "neutron-dhcp-agent"
           should be running
           user
             should eq "neutron"
         Process "neutron-metadata-agent"
           should be running
           user
             should eq "neutron"
         Process "neutron-linuxbridge-agent"
           should be running
           user
             should eq "neutron"

       Finished in 1.52 seconds (files took 0.41367 seconds to load)
       67 examples, 0 failures

       Finished verifying <default-ubuntu-1404> (0m18.47s).
-----> Kitchen is finished. (16m24.13s)

Wow, a fully tested and verified All-in-One OpenStack cloud in one command! Let’s see if I can spin up a VM. Reading the README, it seems I have to run a script when I first ssh into the box to make sure I have an initial network and “CirrOS” image on the machine. OK, so be it.

chef exec kitchen login
sudo su -
bash build_neutron_networks.sh

Now the README tells me I can look at the URL in my web browser, with https://127.0.0.1:8443/horizon and I should see the OpenStack login. I opened up Chrome again and typed it in. Sweet, it worked. Using the username demo and the password mypass, I was able to log in without a hitch. I clicked the Instance button on the left-hand side and then “Launch Instance”. Heavens to Betsy, it actually worked!

I looked around, I didn’t believe it. A successful OpenStack build out of the box? What kind of black magic was this? This is bonkers. I stood up from my laptop and took a walk. This opened up a huge opportunity for my team and I needed to clear my head.

As I walked around my office I saw one of our IT team members carrying a new laptop that he had provisioned for someone on another team. He was going the same direction I was, so I figured I’d ask, “Hey Billy, quick question: what do you do with the old desktops that these laptops are going to replace?”

I guess I caught him off guard because he turned and looked at me with confused and concerned eyes, “Oh, hey, honestly, nothing. They have already depreciated in value so they just sit in the IT room until we get Goodwill to come ’round and we write them off as a donation. Nothing too exciting.”

“Interesting, any chance I can get…three of them, I’d like to try something out.”

“Sure, no problem, come ’round in a bit and I’ll get you what you need.”

“Awesome, see you then.” And I continued walking.

About twenty or thirty minutes later I walked up to the IT room and Billy was looking at Reddit. No real surprise there, 60% of IT work is done or is linked to from Reddit and yes you can quote me on that. “Hey Billy, I’m here, can you hook me up?”

“Sure, no problem, take what you need; just write it down on that clipboard.”

“Can’t I email it to you?”

“Meh, you could, but I’d prefer to see you write down what you take when you take it.”

“Fair enough,” I said, as I picked up one of the desktops. “Any chance you have a spare switch or two?”

“Probably, just be sure to put it down on that clipboard if you find one.”

“Cool, thanks again.”

“No problem”, he said as he turned back to looking at his laptop.

I pulled together what I needed and took it all back to my desk. The README for the model-t had a controller node that needed 3 NICs in it and the compute nodes only needed 2 if I wasn’t going to use a storage network. I thought about it for a few moments and realized that nope, I don’t need a storage network for running a test-kitchen OpenStack cloud, so 2 NICs would be fine.

I powered on the controller node and found it had Windows 7 on it, which meant I had to re-image these machines. I downloaded Ubuntu 14.04, created a USB boot disk, and started the installation on the machines. I tried doing it in parallel but ended up confusing myself, so I did it serially. I rebooted the controller node and started the installation. I did a basic install, naming the machine Controller, original I know, and connected it to my lab network. I was able to ping 8.8.8.8 with only my management network plugged in so I felt like I was making some progress. I repeated the process with the two Compute machines, naming them Compute1 and, you guessed it, Compute2, and then decided to break for lunch. Over lunch I talked to some of my coworkers about what I was doing and all-in-all there was a pretty positive response. I did get a couple giggles and shakes of their heads about OpenStack, but I expected that.

I walked back to my desk, looked at the beginning of what I was hoping to be a real OpenStack cloud and was pretty proud of my work. I remembered then that Steve wanted to have a sync up after lunch, so I headed over to the dreaded stand-up conference room. As I walked in Steve was just starting.

“So it seems we have a couple options on the table. Most of you figured out that you could run test-kitchen with different drivers giving you access to more compute resources. That’s good. Some of you didn’t come to me with anything so I have to assume you either got caught up with something else, or didn’t bother looking.”

After that inspiring talk, I spoke up, “Hey Steve, I’ve made some pretty impressive progress with the OpenStack-model-t cookbook, I’m going to go heads down on it for the rest of the day to see if I can get this done by EOB.”

Steve looked at me, smiled, then looked at the rest of the room and said, “See that’s what I want, someone to run with my assignments. OK, let’s get back to work and let’s see where we are at stand-up tomorrow.” We all started filing out of the conference room, all of us realizing at that moment how much of a waste of our time our sync meetings were.

I sat down in front of my laptop and started to think of what I had to do next. I put a small plan together:

  1. Get a hosted Chef instance, the 5 free nodes they give me ought to be enough to get this proof of concept built.
  2. Get the model-t cookbook up on the hosted instance.
  3. Figure out what I need in the run_list for each of the machines.
  4. Converge and see my cloud come to life.

First is always first, right? So I went to https://manage.chef.io and got myself an instance of hosted Chef. It was pretty easy: I just created a new org, baller-model-t, and then pulled down the getting_started kit. I did a chef exec knife status to confirm it was working and it was. Awesome, step 1 complete. Second, I uploaded the openstack-model-t cookbook to my hosted instance. It complained about dependencies because I had forgotten to do the Berkshelf stuff. So I did the following:

cd openstack-model-t
chef exec berks install
chef exec berks upload
cd ..
chef exec knife cookbook upload openstack-model-t -o .

Sweet, success. OK, from the look of the cookbook, the All-in-One test was set up from the default.rb recipe. That’s good, that’s my controller node. Now the question is what’s in my compute node run_list? Well, it looks like compute_node.rb is the wrapper for just a compute node. Great, I’m almost done. Now to get the converge to happen. I still need to get these boxes to check into the hosted Chef instance, so I decided to use knife bootstrap to do it in one shot. These were the commands I used:

chef exec knife bootstrap controller -x ubuntu --sudo -r 'openstack-model-t::default'
chef exec knife bootstrap compute1 -x ubuntu --sudo -r 'openstack-model-t::compute_node'
chef exec knife bootstrap compute2 -x ubuntu --sudo -r 'openstack-model-t::compute_node'
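
Since the three commands differ only in hostname and recipe, they could also be generated from a small loop. The sketch below only prints the commands rather than running them, so it is safe to try anywhere. The function names are made up for illustration, while the hostnames, user, and recipes are the ones used above.

```shell
#!/bin/sh
# Sketch: print (not run) the knife bootstrap command for each node.
# bootstrap_cmd and plan_bootstrap are hypothetical names for this example.
bootstrap_cmd() {
  # $1 = hostname, $2 = recipe inside the openstack-model-t cookbook
  printf "chef exec knife bootstrap %s -x ubuntu --sudo -r 'openstack-model-t::%s'\n" "$1" "$2"
}

plan_bootstrap() {
  # One controller running the default recipe, then the compute nodes.
  bootstrap_cmd controller default
  for node in compute1 compute2; do
    bootstrap_cmd "$node" compute_node
  done
}

plan_bootstrap
```

Once the printed commands look right, piping them to a shell (plan_bootstrap | sh) would run the bootstraps in order, and adding a compute3 later is a one-word change.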

I went to https://controller/horizon and I couldn’t believe what I saw. I logged in with admin/mypass and I saw 3 hypervisors and an empty OpenStack cloud ready to go. I ssh’d into the controller node, ran bash build_neutron_networks.sh, and saw it come to life. (NOTE FROM MY FUTURE SELF: If I remember correctly, I had to change some of the floating-ip options around on my second build of this, but still, this was amazing.)

I spun up a CirrOS image, then a second, and a third. I ssh’d into each of them and made sure I could ping 8.8.8.8. It worked! I had a running multi-node horizontally scalable cloud on my desk. I looked at the clock and it was only about an hour ‘til quitting time. I searched for some OpenStack cloud images and found Ubuntu’s and CentOS’ and injected them into my cloud. I leaned back for a moment and thought about what I had accomplished. I had to smile. As someone who has always been interested in, but scared to try, building out OpenStack, this seemed like a dream come true.

For a couple moments, I thought about going home, then I realized that I could finish the project off if I just got kitchen-openstack running. So I went through the same steps that I did with Digital Ocean with the kitchen-openstack driver. I ran my parallel tests and successfully spun up a test-kitchen run.
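
For the curious, the .kitchen.yml for kitchen-openstack might look roughly like the sketch below, again written as a shell heredoc. The key names are the ones documented in the kitchen-openstack README as far as I can tell, so verify them against the version you install. Every value (the auth URL, tenant, key pair, flavor and image names, and the OS_USERNAME and OS_PASSWORD variables) is a placeholder assumption, not something taken from the cloud in this story.

```shell
#!/bin/sh
# Sketch only: write an example .kitchen.yml for the kitchen-openstack driver.
# Every value below is a placeholder assumption; substitute your own cloud's details.
KITCHEN_DIR="${KITCHEN_DIR:-$(mktemp -d)}"

cat > "$KITCHEN_DIR/.kitchen.yml" <<'EOF'
---
driver:
  name: openstack
  openstack_username: <%= ENV['OS_USERNAME'] %>
  openstack_api_key: <%= ENV['OS_PASSWORD'] %>
  openstack_auth_url: http://controller:5000/v2.0/tokens
  openstack_tenant: demo
  key_name: my-keypair
  flavor_ref: m1.small

platforms:
  - name: ubuntu-14.04
    driver:
      image_ref: ubuntu-14.04-cloudimg
  - name: centos-7
    driver:
      image_ref: centos-7-cloudimg
EOF

echo "wrote $KITCHEN_DIR/.kitchen.yml"
```

Keeping the credentials in environment variables, as with the Digital Ocean token earlier, keeps them out of the repository.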

I couldn’t be more proud of my accomplishments for today. For the first time in a while I’m going to be excited for stand-up tomorrow when I can show this off to everyone.

2 comments :

TWForeman said...

Nice article, but none of the URLs are actual clickable links...

Rob Nelson said...

I laughed, I cried, I left a comment :) Awesome, quotable article to start of SysAdvent, thank you!