December 6, 2009

Day 6 - Storage: ATA over Ethernet

This article was written by Bob Feldbauer.

Continuing from yesterday's storage discussion on DRBD, let's introduce another option in the storage space.

When your environment has grown beyond direct attached storage (internal drives, or an external drive array) and basic network attached storage (NAS), the next step is generally to consider implementing a storage area network (SAN); however, the cost and complexity of existing Fibre Channel and iSCSI SAN solutions can be daunting. Fortunately, ATA over Ethernet (AoE) can often be used as a simpler, lower-cost alternative.

Before diving into AoE, you should be aware of its limitations. The major idiosyncrasy of AoE is that it does not use TCP/IP; therefore, it is not routable. Also, compared to iSCSI, it lacks encryption and user-level access control. AoE truly shines when you simply need storage over a local network and can limit access by controlling physical switch ports, VLANs, etc.
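
If you do need to lock AoE down, the initiator side can restrict which interfaces the driver uses. A minimal sketch, assuming eth1 is a dedicated storage interface (the interface name is just a placeholder):

# restrict the running AoE driver to eth1
aoe-interfaces eth1
# or apply the restriction at module load time
modprobe aoe aoe_iflist=eth1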

AoE is supported on a wide variety of operating systems, including Linux, Windows, FreeBSD, VMWare ESX, Solaris, Mac OS X, and OpenBSD. Like Fibre Channel, iSCSI, and other storage protocols, the AoE protocol implements an initiator-target architecture - the initiator sends commands, and the target receives them. On Linux, various pieces of software provide initiator and target functionality. Four independently developed AoE targets exist for Linux: qaoed, ggaoed, kvblade, and vblade. Since vblade is part of aoetools, it generally seems to have the most active development community; therefore, we'll use it for our example setup. The AoE driver works with both 2.4.x and 2.6.x Linux kernels, but it is best used with kernel 2.6.14 or higher and a recent version of aoetools.
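
To make the initiator-target split concrete, here is the raw vblade invocation that vblade-persist (used later in this article) wraps; the shelf/slot numbers, interface, and backing device are placeholders:

# export /dev/sdx as AoE shelf 0, slot 1 over eth0; plain vblade runs in
# the foreground, which is why we'll manage exports with vblade-persist
vblade 0 1 eth0 /dev/sdx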

For our example configuration, we'll be setting up AoE on Debian Linux (Lenny), and using LVM2. The AoE module should be included if you're using a standard Debian kernel, but let's check using:

grep ATA_OVER /boot/config-`uname -r`
Which should return:
CONFIG_ATA_OVER_ETH=m

That means AoE is supported through a kernel module. If you aren't already using AoE, the module is probably not loaded yet. Let's load it, and add it to /etc/modules so it is automatically loaded when the system boots in the future:

modprobe aoe
echo "aoe" >> /etc/modules
On Debian, we'll use apt to install the remaining necessary components:
apt-get update
apt-get install lvm2 vblade vblade-persist aoetools

Note that aoetools includes the following tools:

aoecfg - manipulate AoE configuration strings
aoe-discover - trigger discovery of AoE devices
aoe-flush - flush the down devices out of the AoE driver
aoe-interfaces - restrict network interfaces used for AoE
aoe-mkdevs - create character and block device files
aoe-mkshelf - create block device files for one shelf address
aoeping - simple userland communication with AoE devices
aoe-revalidate - revalidate the disk size of an AoE device
aoe-stat - print status information for AoE devices
aoe-version - print AoE-related software version information
coraid-update - upload an update file to a Coraid appliance

For our example, we'll use LVM2 and AoE to allocate space and make it available over the network, using two disks. For a recent project, I used 13 drives on a hardware RAID controller, divided into two RAID6 arrays of 6 drives (4TB) and 7 drives (5TB).

Assuming the disks are the second and third disks on a Linux system, configure LVM2 to recognize the physical volumes:

pvcreate /dev/sdb
pvcreate /dev/sdc

Create two LVM2 volume groups on the physical volumes - for our example, let's call them content and backups:

vgcreate content /dev/sdb
vgcreate backups /dev/sdc

Then create two 1TB LVM2 logical volumes (/dev/content/server1, /dev/backups/server1):

lvcreate -L 1T -n server1 content
lvcreate -L 1T -n server1 backups
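
At this point a quick sanity check doesn't hurt; LVM2's reporting commands will confirm that the physical volumes, volume groups, and logical volumes all look as expected:

pvs
vgs
lvs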

Although there are many different types of filesystems available under Linux, for familiarity and to avoid unnecessary complexity in our example, we'll use ext3:

mkfs.ext3 /dev/content/server1
mkfs.ext3 /dev/backups/server1

Now that we have our drives configured with LVM2 and formatted with a usable filesystem, we can set up the AoE target using vblade-persist:

vblade-persist setup 0 1 eth0 /dev/content/server1
vblade-persist setup 0 2 eth0 /dev/backups/server1
vblade-persist start 0 1
vblade-persist start 0 2
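
The two numeric arguments to vblade-persist are the AoE shelf and slot addresses; initiators will see shelf 0, slots 1 and 2 as /dev/etherd/e0.1 and /dev/etherd/e0.2. To confirm both exports are configured and running:

# list the exports vblade-persist knows about
vblade-persist ls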

To mount our newly created AoE devices, run the following on a second server:

modprobe aoe
apt-get install aoetools
aoe-discover
aoe-stat    # should show the available AoE exports
mkdir /mountpoint
# replace e0.1 with the appropriate device from aoe-stat
mount /dev/etherd/e0.1 /mountpoint 
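
To make the mount survive a reboot on the client side, one approach (a sketch, not tested here) is to load the aoe module at boot and add an fstab entry; _netdev should tell the boot scripts to defer the mount until networking is up:

echo "aoe" >> /etc/modules
# hypothetical fstab entry - adjust device, mountpoint, and options as needed
echo "/dev/etherd/e0.1 /mountpoint ext3 defaults,_netdev 0 2" >> /etc/fstab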

Your options in the storage space are many. This introduction should give you the necessary tools to decide if AoE is the storage solution that meets your requirements.

5 comments:

  1. Awesome writeup, Bob. One of the most attractive parts of AoE for me is the killer low price of the AoE disk arrays from Coraid. The EtherDrive arrays have amazing storage for unbelievably low prices, at least compared to enterprise SAN offerings from anyone else.

    Thanks for writing this. Really good stuff!

  2. I'm glad you liked the article, Matt. Thanks for the feedback! I haven't used any of Coraid's products, but they do seem very reasonably priced. My experience with AoE has been with generic server hardware, and it has worked well for my purposes.

    I'm currently mostly using it to provide centralized storage for Linux vservers, and database backups. Being able to move a vserver to a new physical server by detaching/attaching its mounted partition is great.

  3. Nice writeup. I'd love to hear from anyone using AoE with VMWare in production, especially compared to using iSCSI and/or NFS.

    I'm not sure I 100% agree that AoE is less costly / easier than iSCSI. At a minimum, iSCSI works with everything a network already has and doesn't require special HBAs, as is the case with AoE + VMWare.

    That said, it's unfortunate AoE hasn't seen broader adoption. I think storage vendors are really banking on a future of converged (IP) networks running over 10GbE (and higher) links - thus the lack of attention to non-IP options.

  4. Jeff: I'm guessing you're talking about Coraid's EtherDrive HBA for VMWare ESX; however, that doesn't seem to be a requirement if you use the AoE driver for ESX?

    Check out...

    http://www.vmworld.com/thread/1938

    "The ESX AoE driver interacts with the VMkernel itself allowing imported AoE targets to be used as block devices for ESX itself. This means one can format it as VMFS and store the virtual machines on AoE targets."

    Having said that, I honestly don't know much about AoE and VMWare. I generally use Linux vserver and Xen, with the occasional (painful) use of Hyper-V.

  5. A primary usage of iSCSI is instant disk space. With no server downtime or reboot required, you install the initiator driver on the Windows machine, map the storage, then allocate the desired disk capacity and format it under Disk Manager.
