sysadvent: dns

December 8, 2010

Day 8 - Everything is a DNS Problem

Written by Kris Buytaert (@KrisBuytaert)

Systems break. Whether you like it or not, one day, they will break. Either when they are up and running or when you are building new stuff, you will one day run into problems. Sometimes the error messages will guide you to the solution quickly, but sometimes they give you no pointers at all, and sometimes there are no error messages - just weird behavior.

When that happens, it's time to pull out your troubleshooting skills. And so, you read logfiles, you google, but you find nothing; you lie awake at night trying to figure out what parameter in which config file you forgot.

In the next couple of examples, I'll guide you through some issues I ran into over the past decade. The list is far from exhaustive, but it might give you an idea.

Let's start with some trivial stuff.

Who hasn't heard the "When I log on to server X, it takes a while" complaint from a user? However, when you log on to the box, it goes lightning fast. At first, you think it was a temporary glitch, but those 5 users keep complaining. You go to their workstation and, indeed, from their desk things do go a lot slower. It turns out they are in a newer part of the building that is on the newest subnets in your organization, and for those networks there are no reverse mappings yet. When the users log in, the server first tries to figure out who they are by looking up their address, and sometimes that lookup takes unacceptably long to complete.

Reverse DNS lookups causing performance problems are amongst the most common problems around. They happen with databases, regular logins, etc., and sometimes people doing performance comparisons of MySQL vs NoSQL tools fall into the trap and end up testing a failing DNS lookup.
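To see whether a reverse lookup is what slows a login down, you can time it from the server itself. The following is a minimal sketch using only the Python standard library; the function name time_reverse_lookup and the injectable resolver parameter are illustrative choices, not part of any tool mentioned above.

```python
import socket
import time


def time_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (hostname or None, seconds spent) for a reverse lookup of ip."""
    start = time.monotonic()
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        hostname = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError
        hostname = None
    elapsed = time.monotonic() - start
    return hostname, elapsed
```

Calling time_reverse_lookup with the address of one of the slow workstations, on the server they log in to, shows how long the server waits: a result of None after several seconds points at a missing reverse mapping.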

The quick fix, adding the hosts to your /etc/hosts file, is sometimes the only alternative, as you don't always have control over the reverse DNS mapping.

Luckily, lots of daemons also let you disable the feature, such as the skip_name_resolve entry in your my.cnf or the UseDNS no setting in your sshd_config.

Don't think it's just the regular MySQL and sshd services; there are even web applications that start performing slowly because of DNS problems, as one WordPress user figured out. Some applications are slow when DNS is misconfigured, but plenty of applications just refuse to launch when they can't figure out where they are running, e.g. an old DRBD issue caused drbdadm to crash. The easy ones to detect are the ones that actually tell you they can't look up localhost or the node you are starting the application on. Performance issues are usually also a good pointer, but plenty of times the problem just doesn't show.

I've seen DNS causing problems across the board: Xen, GFS, DRBD, Oracle and many others. Apart from applications that have problems with misconfigured DNS setups, there are also people who try parsing the output of dig to find out the nameserver by grepping for "SERVER", since dig notes in the comment section of its output which nameserver it used. Now imagine the output of dig containing any of the root nameservers, such as A.ROOT-SERVERS.NET: the grep matches those lines too, and the detection will fail.
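The dig pitfall is easy to reproduce. The sketch below compares a plain substring match with a pattern anchored on the ";; SERVER:" footer line. The sample text is an abbreviated, hand-written illustration of dig output (not captured from a real run), 192.0.2.53 is a documentation address, and both function names are made up for this example.

```python
import re

# Matches only dig's footer line, e.g. ";; SERVER: 192.0.2.53#53(192.0.2.53)"
SERVER_LINE = re.compile(r"^;; SERVER: (\S+?)#(\d+)", re.MULTILINE)


def naive_server_lines(dig_output):
    """Rough equivalent of `grep SERVER`: every line containing the substring."""
    return [line for line in dig_output.splitlines() if "SERVER" in line]


def nameserver_used(dig_output):
    """Return the address from the ';; SERVER:' footer line, or None."""
    match = SERVER_LINE.search(dig_output)
    return match.group(1) if match else None


# Abbreviated, illustrative output of a query for the root NS records.
SAMPLE = """\
;; ANSWER SECTION:
.\t\t518400\tIN\tNS\tA.ROOT-SERVERS.NET.
.\t\t518400\tIN\tNS\tB.ROOT-SERVERS.NET.

;; Query time: 12 msec
;; SERVER: 192.0.2.53#53(192.0.2.53)
"""
```

Here naive_server_lines(SAMPLE) returns three lines, two of which are root server records, while nameserver_used(SAMPLE) returns only the resolver address.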

DNS problems can creep up on you in unexpected ways in every part of your infrastructure, so what can you do to prevent them?

The first and most important problem to solve is to ensure that you have a correct reverse mapping for every part of your network. RFC 1912 clearly points out that "Every Internet-reachable host should have a name." and "Make sure your PTR and A records match. For every IP address, there should be a matching PTR record in the in-addr.arpa domain. If a host is multi-homed, (more than one IP address) make sure that all IP addresses have a corresponding PTR record (not just the first one)."
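The "PTR and A records match" rule can be checked mechanically. This is a rough sketch using the system resolver through Python's socket module; ptr_matches_a is a hypothetical helper name, the two resolver parameters exist only so the check can be exercised without a network, and socket.gethostbyname_ex covers IPv4 only.

```python
import socket


def ptr_matches_a(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Return True if ip has a reverse name that resolves back to ip."""
    try:
        name = reverse(ip)[0]          # PTR lookup: address -> name
        addresses = forward(name)[2]   # A lookup: name -> list of addresses
    except OSError:
        # No PTR record, or the PTR name has no A record
        return False
    return ip in addresses
```

Running this over every address in a subnet gives a quick list of hosts whose reverse mapping is missing or points at a name that resolves elsewhere. Note that the system resolver also consults /etc/hosts, so the result reflects what the server sees rather than DNS alone.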

So, if you have a 172.16.0.0/24 RFC 1918 subnet in your network, you want to have a reverse zone that looks like:

more 172.16.0.db
$TTL 604800
$ORIGIN 0.16.172.in-addr.arpa.
@ IN SOA ns1.yournetwork.org. root.yournetwork.org. (
        2010101501 ; Serial
        3600       ; Refresh
        3600       ; Retry
        2419200    ; Expire
        604800 )   ; Negative Cache TTL
;
        IN NS ns1.yournetwork.org.

1 IN PTR ns1.yournetwork.org.
2 IN PTR zion.yournetwork.org.
3 IN PTR matrix.yournetwork.org.
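If you already keep a list of hosts somewhere, the PTR lines can be generated instead of typed. The sketch below relies on the reverse_pointer attribute of Python's ipaddress module; the ptr_records function and the host mapping are illustrative assumptions, reusing the names from the zone above.

```python
import ipaddress


def ptr_records(hosts, origin):
    """Yield zone-file PTR lines for a {ip: fqdn} mapping, relative to origin."""
    suffix = "." + origin.rstrip(".")
    # Sort numerically by address so the zone stays readable
    ordered = sorted(hosts.items(), key=lambda item: ipaddress.ip_address(item[0]))
    for ip, fqdn in ordered:
        # e.g. 172.16.0.2 -> "2.0.16.172.in-addr.arpa"
        pointer = ipaddress.ip_address(ip).reverse_pointer
        if not pointer.endswith(suffix):
            raise ValueError(f"{ip} does not belong in zone {origin}")
        label = pointer[: -len(suffix)]
        yield f"{label} IN PTR {fqdn.rstrip('.')}."


if __name__ == "__main__":
    hosts = {
        "172.16.0.1": "ns1.yournetwork.org",
        "172.16.0.2": "zion.yournetwork.org",
        "172.16.0.3": "matrix.yournetwork.org",
    }
    for line in ptr_records(hosts, "0.16.172.in-addr.arpa."):
        print(line)
```

The ValueError guards against an address from another subnet ending up in the wrong reverse zone.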

For public network addresses, you sometimes have to talk to your upstream vendor for reverse mappings that match your domain. Sometimes they already have a reverse-map.customerof.theirdomain.com, but they usually are happy to make the updates or to delegate the administration whenever possible.

The second problem relates to updating DNS zonefiles: people often forget to update the serial number of their zonefile. After hours and hours, the newly added host still isn't known on the network or Internet, even though the zonefile on the primary nameserver shows it correctly. However, the nameserver and its slaves haven't realized there is a new zonefile around yet. Failing to update the serial number of the zonefile is the default problem that everybody falls into once in a while. And what if you are using a YYYYMMDDID timestamp and by accident put a YYYYDDMMID timestamp in place? Chances are you need to wait a whole year before you can continue to use your old scheme, or you can add 2147483647 to the now-incorrect value as documented here.
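The "add 2147483647" trick works because zone serials are compared with 32-bit wrap-around arithmetic (RFC 1982). Here is a small sketch of that arithmetic; the function names are made up for illustration, and the example serials assume 20 October 2010 was mistyped in day-month order.

```python
import datetime

SERIAL_MODULUS = 2 ** 32
MAX_INCREMENT = 2 ** 31 - 1  # 2147483647, the largest step a slave accepts


def make_serial(day, revision):
    """Build a YYYYMMDDID serial from a date and a two-digit revision."""
    if not 0 <= revision <= 99:
        raise ValueError("revision must fit in two digits")
    return int(day.strftime("%Y%m%d")) * 100 + revision


def is_newer(candidate, current):
    """RFC 1982 comparison: would a slave holding `current` accept `candidate`?"""
    return candidate != current and (candidate - current) % SERIAL_MODULUS < 2 ** 31


def rollover_serial(bad_serial):
    """Intermediate serial to publish before dropping back to a lower value."""
    return (bad_serial + MAX_INCREMENT) % SERIAL_MODULUS


if __name__ == "__main__":
    bad = 2010201001                                     # YYYYDDMMID by accident
    good = make_serial(datetime.date(2010, 10, 20), 1)   # 2010102001
    step = rollover_serial(bad)
    print(is_newer(good, bad))    # False: slaves ignore the corrected serial
    print(is_newer(step, bad))    # True: slaves pick up the intermediate serial
    print(is_newer(good, step))   # True: the intended serial now counts as newer
```

The order matters: publish the intermediate serial, wait until every slave has transferred it, and only then switch to the serial you intended.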

Before I let you guys go, I do have to point you to a tool you can't live without: http://intodns.com/. This is an online service that will check your public DNS config and point out different improvements you can make. Try it! It's worth your time.

By now, you must realize that everything is a funky DNS problem, and as @patrickdebois realized, DNS stands for Devops Need Sushi, but that's a different post :)

December 7, 2009

Day 7 - Active Directory naming is easy, right?

This article was written by Sam Cogan

Active Directory naming is easy, right? You've just got to pick a name for your domain; any name will do won't it?

This is the view many newcomers to Active Directory (AD) take, and it's the view I had when I was first introduced to AD. It often works, even for a while. Then, a few days or weeks down the line, you start to notice problems, or with greater understanding of how AD works, you realise that perhaps there was a better name. By this time, it is too late - the name is set in stone. Sure, you could rename it with the domain name rename tool (rendom), but it's likely to cause problems. Let's look at why AD naming can be problematic and what we can do to make things better.

Microsoft's decision to tie Active Directory closely to DNS, while making sense, has caused a lot of problems for inexperienced sysadmins. One of the most common problems I hear from new sysadmins working with AD for the first time is, "I setup Active Directory with our company's external domain name, but now no-one can get to the company website or ftp site!"

Why does this happen? If your AD domain is example.com, AD will answer DNS queries for that domain, which likely fails to serve external services properly, such as your corp website at www.example.com.

Using your company's external domain name for DNS seems like the perfect idea at first. Limited understanding of how AD interacts with DNS has led to a decision that may create problems and administrative overhead. Yes, there are potential solutions to this problem: implementing split brain (aka split view) DNS, changing your AD name, or installing IIS on every domain controller to perform redirects. But it's a scary prospect for a new sysadmin whose boss is about to explode because he can't get to their website, and it is often enough to put them off AD for good. So yes, you can use your external domain name for AD, but in my opinion, you shouldn't. It causes problems, so why give yourself the headache?

I've found there are a number of excuses people give for using the external domain name for AD, and I've used some of them myself. For example, "We had to use our external domain because we want to use that domain name for our UPN suffix". Truthfully, you can have as many UPN suffixes as you like by adding them in the Domains and Trusts MMC. Inexperience with AD may drive assumptions as above, but after digging into it, you will find that your assumptions may not be correct about what you think you need to use as your AD domain.

So, what AD domain name should we use? There are two common schools of thought on this subject: either something like example.local, or use a subdomain of your external domain (like corp.example.com, if you own example.com). Alternately, you can use a different external domain name, but this is not recommended for the general case.

The use of the .local extension came about because it allowed the separation of the AD domain from the registered internet domain (i.e., example.com) without having to buy another domain. It's also easy to get an SSL certificate for a .local domain from a trusted SSL vendor, should you need one for internal resources. The alternative is to choose a real, unowned TLD to build your domain on, but you have the obvious risk of that domain being owned by someone else.

Maybe we have a good domain decision, now, with no extra cost? Maybe not! There are problems with using the .local domain. First, it's not a reserved TLD. While it's unlikely, it's possible that IANA could choose to delegate this TLD, opening it up for registration and causing potential name conflicts. Second, the use of .local can also cause problems if you have Apple computers on the network, as it is used by the Bonjour service. Finally, because .local domains are not controlled by a registrar, someone else could be using the same domain name in another AD instance. This problem will bite you when you need to establish trusts or merge domains with another AD instance - if both of you are using example.local you will have conflicts.

Despite these problems, the use of .local is still popular, especially in small companies. Microsoft's Small Business Server even suggests it when you use its configuration wizard to create an AD domain.

Besides naming with .local, you could choose the name as a subdomain of your external domain, such as ad.example.com, or buy an additional domain for AD only, such as examplecorp.com. Using something like ad.example.com or corp.example.com is pretty common today; Microsoft also recommends this. This is easy and ensures ownership of that domain (unless you forget to renew example.com). Using this method means that your AD DNS server is only responsible for this subdomain and will happily forward requests for your external websites on to the DNS servers responsible for them.
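The naming options discussed so far can be summed up as a small decision helper. This is only a sketch of the article's advice, not a Microsoft tool: assess_ad_domain and its verdict strings are hypothetical, and the check is a plain string comparison rather than a DNS lookup.

```python
def assess_ad_domain(candidate, external_domain):
    """Classify a proposed AD domain name against the external domain you own."""
    candidate = candidate.lower().rstrip(".")
    external = external_domain.lower().rstrip(".")
    if candidate == external:
        # AD DNS would answer for www.example.com and friends
        return "avoid: same as the external domain"
    if candidate.endswith("." + external):
        return "ok: subdomain of a domain you own"
    if candidate.endswith(".local"):
        # Unregistered, clashes with Bonjour, may collide on trusts or merges
        return "caution: .local domain"
    return "check: make sure you own this domain"


if __name__ == "__main__":
    for name in ("example.com", "corp.example.com", "example.local", "examplecorp.com"):
        print(name, "->", assess_ad_domain(name, "example.com"))
```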

AD naming seems easy, but as we've shown above, there are important considerations when choosing a domain name for your Active Directory domain. This advice, "be careful about seemingly-simple decisions," carries to many other spaces than AD.

Setting up an AD infrastructure should be planned carefully. The domain name choice should be a part of this planning. Consider how your network will be used, and how it will grow over time. If you don't own a public domain name, is it worth purchasing one now so you can ensure your AD domain name is reserved, even if you never use it on the internet? If you do have an external domain name already, try to keep your internal and external DNS separate; you'll appreciate it in the long run. Finally, never use a public domain name that you don't own. You never know who might snap it up and cause you problems!
