Lecture 10: The Domain Name System (DNS)
IP Addresses Revisited
In lecture 2 we first introduced
the concept of the IP Address of a computer. We stated
that it's simply another way of identifying a specific machine -- we
could equally use the machine's name to identify it.
These two different ways of referring to a system come from two
semi-conflicting requirements:
- The hierarchical domain-based naming system used in the Internet to
identify machines is designed for human usage --
in general, we find this system incredibly convenient to use,
having a high correlation with how the Real World™
is organised. We are very comfortable with the use of names like
amazon.com and ironbark.bendigo.latrobe.edu.au.
- Actual packet delivery in the Internet is based on
a separate, fixed-length (four byte) numeric IP
Address. For example, the IP address of ironbark is 149.144.21.60.
IP addresses are used by the routers which implement the
Internet's delivery service, the Internet
Protocol, or IP, to route packets of data through the
Internet to their destination.
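To make the "fixed-length (four byte)" point concrete, here is a minimal sketch using Python's standard ipaddress module. It only unpacks the address quoted above; the variable names are illustrative.

```python
import ipaddress

# ironbark's address, as quoted in the text above.
addr = ipaddress.IPv4Address("149.144.21.60")

# The dotted form is just a readable rendering of four bytes.
raw = addr.packed
print(len(raw))    # number of bytes in the address
print(list(raw))   # each byte as a decimal number
print(int(addr))   # the same four bytes viewed as one 32-bit integer
```

Routers work with the packed form; the dotted notation exists only for people.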
The Internet Domain Name System (DNS) is used to
provide a mapping between these two alternative identification
approaches: the human-oriented domain name and the delivery-oriented IP
address. Its most common usage is to look up the IP
address corresponding to a known domain name.
The DNS Hierarchy
The DNS Namespace is based on a "tree" structure, with
a small(ish) number of generic Top Level Domains, or TLDs (eg,
.com, .edu, .org) and a large number of country-based domains
(eg .au, .my, .uk). Each TLD supports a group of "second-level"
domains, and so on, all the way down to individual hosts.
- A domain name is a dotted sequence of name components describing a
path through the name hierarchy, written from the host up to the
root, optionally with a trailing dot, thus:
bindi.bendigo.latrobe.edu.au.
- An individual name component must be at most 63 characters long,
traditionally must begin with a letter, etc...
- Upper and lowercase may be used, although name lookups are case
insensitive by definition.
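The naming rules above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and only the length limit and case-insensitivity are checked, not the full set of character rules.

```python
MAX_LABEL_LENGTH = 63  # upper limit on a single name component


def split_labels(name):
    """Split a domain name into its components, ignoring one trailing dot."""
    if name.endswith("."):
        name = name[:-1]
    labels = name.split(".")
    for label in labels:
        if not 1 <= len(label) <= MAX_LABEL_LENGTH:
            raise ValueError(f"bad label length: {label!r}")
    return labels


def same_name(a, b):
    """Compare two domain names the way a lookup would: ignoring case."""
    lower_a = [label.lower() for label in split_labels(a)]
    lower_b = [label.lower() for label in split_labels(b)]
    return lower_a == lower_b


def path_from_root(name):
    """Return the components in root-to-host order (the path down the tree)."""
    return list(reversed(split_labels(name)))


print(path_from_root("bindi.bendigo.latrobe.edu.au."))
print(same_name("Amazon.COM", "amazon.com."))
```

Reversing the components shows the tree structure directly: the rightmost component is the TLD, and each step to the left moves one level further from the root.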
Resource Records
Each domain name has one or more Resource Records
(RRs) associated with it. Resource records are 5-tuples, written here
in the same field order as the example records later in this lecture:
Domain_name TTL Class Type Value
- Domain_name
- the name of the domain to which this RR applies.
- TTL
- the Time To Live of this RR. When this RR is
returned as a result of a DNS lookup, the remote host normally
caches the information for efficiency. The TTL is the time, in
seconds, after which the cached information should be regarded
as potentially out of date.
- Type
- there are several types of RR, including:
SOA - Start Of Authority.
A - IP address of a host.
NS - Name Server, etc
- Class
- In practice always set to "IN", for Internet
- Value
- The actual value of this particular RR. Can be, for example, an
IP address, a number, some ASCII text or a combination.
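As a rough sketch of how a caching host might hold an RR and apply its TTL, consider the following. The class, its field names and the fetched_at timestamp are assumptions made for this illustration, not part of any DNS library.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ResourceRecord:
    """One RR as described above, plus the time at which we cached it."""
    name: str
    ttl: int          # seconds the cached copy may be trusted
    rr_class: str     # "IN" for Internet
    rr_type: str      # e.g. "A", "NS", "SOA"
    value: str
    fetched_at: float = field(default_factory=time.time)

    def is_stale(self, now: Optional[float] = None) -> bool:
        """True once more than ttl seconds have passed since caching."""
        if now is None:
            now = time.time()
        return now - self.fetched_at > self.ttl


rr = ResourceRecord(name="ironbark", ttl=86400, rr_class="IN",
                    rr_type="A", value="149.144.20.200")
print(rr.is_stale())  # freshly cached, so not yet stale
```

A TTL of 86400 seconds is one day, so a cached copy of this record would be treated as potentially out of date after 24 hours.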
DNS Servers and Resolvers
A nameserver provides domain-name-to-IP-address
mappings (and a few other functions, but "looking up" IP addresses is
the most common) for one or more zones, which are
sub-trees of the domain name space. For example, sheoak is
a nameserver for the zone bendigo.latrobe.edu.au. This
means that if I want to look up a particular IP address in that zone, I
can ask sheoak.
Exactly which server is responsible for a particular zone is specified
in start of authority (SOA) RRs. The SOA RR for a zone names the
primary nameserver which has authority for it.
It also has the email address of the zone's administrator, a
serial number which is increased whenever the zone data changes, and
various other bits and pieces.
The DNS system forms a distributed database of domain
information.
A resolver is a library function[1] which queries the nameserver when called
from a user program. It can check the local cache of names and, if
necessary, request an RR from a nameserver (caching the response). In
other words, a resolver is software which asks a nameserver for
information.
[1] Such as the one built into the Unix library function gethostbyname(3).
Nameserver Queries
The resolver sends a question to a name server, of the
form:
{query domain name, type, class}
The server responds with one or more appropriate RRs. It also sends an
ADDITIONAL INFORMATION section, which contains extra RRs which the
resolver will probably find useful. For example, if a resolver queries
for a particular NS RR, the server will return it,
plus additional information giving the IP address of the name server
specified in the main body of the reply.
The most common DNS query is of type A, where the
resolver is required to map a domain name to an IP address - that is,
"looking up" an IP address. Some typical type A RRs look like:
ironbark 86400 IN A 149.144.20.200
redgum 86400 IN A 149.144.21.3
bindi 86400 IN A 149.144.20.82
Note that the "domain name" part of these RRs has been omitted (leaving
only the hostnames) for clarity.
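The question/answer matching can be illustrated with a small in-memory sketch that parses the three records above and answers a {name, type, class} question. The function names are invented for this example, and a real server's lookup is considerably more involved.

```python
RECORDS_TEXT = """\
ironbark 86400 IN A 149.144.20.200
redgum 86400 IN A 149.144.21.3
bindi 86400 IN A 149.144.20.82
"""


def parse_records(text):
    """Turn whitespace-separated RR lines into (name, ttl, class, type, value) tuples."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        name, ttl, rr_class, rr_type, value = line.split(None, 4)
        records.append((name, int(ttl), rr_class, rr_type, value))
    return records


def answer(records, name, rr_type, rr_class="IN"):
    """Return every RR matching the question; names compare case-insensitively."""
    return [r for r in records
            if r[0].lower() == name.lower()
            and r[3] == rr_type
            and r[2] == rr_class]


records = parse_records(RECORDS_TEXT)
print(answer(records, "redgum", "A"))
```

A question for a type that has no matching record, such as an MX query against this data, simply returns an empty list.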
Recursive Queries
A host within a specified domain (eg a machine at Bendigo, in domain
bendigo.latrobe.edu.au) is configured to "know" the
IP address of its local nameserver. What happens when it sends a query
for a non-local name (eg amazon.com)? The sequence
of events is something like:
- The local nameserver will forward the query to its "parent"
nameserver -- in our case, the nameserver for domain latrobe.edu.au.
- This nameserver, in turn, (usually)
forwards the query recursively up the "tree", to
where a root nameserver will pass it to a
nameserver for the .com domain, which will have
the desired name-to-address mapping.
- The result of the query is then passed back through the chain of
nameservers (each of which will normally cache the
information), finally arriving at the originating host. This
process is called a recursive query.
Recursive queries can clearly be quite slow. The DNS
provides a way of "short-circuiting" the whole process. If (for
example) the local nameserver already knows (due to caching) the IP
address of a nameserver for the .com domain in the
above example, it can contact it directly, thus avoiding many recursive
stages. This is called an iterative query: the server being asked does
not chase the answer on the asker's behalf, but replies with the best
information it has (typically a referral to a nameserver closer to the
answer), and the asking nameserver follows that referral itself. In
practice, queries to root nameservers are iterative. Every
nameserver is configured to know the IP address of at least one root
nameserver.
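The referral-following idea can be sketched with a toy, in-memory set of "servers". Everything here is hypothetical: the server names, the table layout and the amazon.com address (taken from a range reserved for documentation) are invented for illustration, and no network traffic is involved.

```python
# Toy data: each "server" either knows the answer or refers us
# to a server responsible for a suffix of the name.
SERVERS = {
    "root": {"answers": {},
             "referrals": {"com": "com-server", "au": "au-server"}},
    "com-server": {"answers": {"amazon.com": "192.0.2.10"},
                   "referrals": {}},
    "au-server": {"answers": {},
                  "referrals": {"latrobe.edu.au": "latrobe-server"}},
    "latrobe-server": {"answers": {"ironbark.bendigo.latrobe.edu.au": "149.144.21.60"},
                       "referrals": {}},
}


def ask(server, name):
    """One query to one server: ("answer", ip), ("referral", next) or ("fail", None)."""
    data = SERVERS[server]
    if name in data["answers"]:
        return ("answer", data["answers"][name])
    best = None
    for suffix, target in data["referrals"].items():
        if name == suffix or name.endswith("." + suffix):
            if best is None or len(suffix) > len(best[0]):
                best = (suffix, target)   # prefer the most specific referral
    if best is not None:
        return ("referral", best[1])
    return ("fail", None)


def resolve_iteratively(name, start="root", max_steps=10):
    """Follow referrals ourselves; return (address or None, servers contacted)."""
    server = start
    path = [server]
    for _ in range(max_steps):
        kind, value = ask(server, name)
        if kind == "answer":
            return value, path
        if kind == "fail":
            return None, path
        server = value
        path.append(server)
    return None, path


print(resolve_iteratively("amazon.com"))
print(resolve_iteratively("amazon.com", start="com-server"))
```

The second call models the short-circuit described above: a nameserver that has cached the address of the .com server can start there and skip the root entirely.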
Some DNS Subtleties
- Mail eXchange
- the DNS provides the MX type of RR to discover where email is
to be delivered. An MX RR specifies a primary mailhost, and
less preferred hosts, to which mail for a specified domain is to
be delivered; the host with the lowest preference value is tried
first. For example, ironbark has:
ironbark  IN  MX  10  ironbark
          IN  MX  20  redgum
          IN  MX  40  sheoak
- Reverse lookups
- a special domain (in-addr.arpa) and address
format is used to map IP addresses to domain names. The four bytes of
the address are written in reverse order in front of the special
domain, thus:
60.21.144.149.in-addr.arpa
The RR stored under this name is called a PTR RR. Performing reverse
lookups is much more difficult than normal "forward"
address lookups.
- CNAME
- Often a host may be known by several names: names other than
the official host name are called aliases, and a CNAME RR maps
an alias name to a host's "real" name.
- HINFO
- describes some basic information about the type of CPU and the
OS it is running. Rarely kept up-to-date.
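Two of the items above lend themselves to a short sketch: ordering mailhosts by MX preference (lowest value first) and building the in-addr.arpa name for a reverse lookup. The helper names are made up for this example; reverse_pointer is an attribute of the standard ipaddress module.

```python
import ipaddress

# (preference, mailhost) pairs from the ironbark example above,
# deliberately listed out of order.
MX_RECORDS = [(40, "sheoak"), (10, "ironbark"), (20, "redgum")]


def delivery_order(mx_records):
    """Mailhosts in the order they should be tried: lowest preference first."""
    return [host for _, host in sorted(mx_records)]


def reverse_name(ip):
    """Name to query for the PTR RR of an IPv4 address."""
    return ipaddress.IPv4Address(ip).reverse_pointer


print(delivery_order(MX_RECORDS))
print(reverse_name("149.144.21.60"))
```

Reversing the bytes puts the most significant part of the address nearest the in-addr.arpa suffix, so reverse names fit the same right-to-left hierarchy as ordinary domain names.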
DNS Implementation Technicalities
The DNS works as a distributed database because of two fundamental
ideas: replication and caching. We
have already seen how caching works -- at any point in a query, if a
nameserver has a current copy of the desired information, it can supply
it instead of contacting other nameservers.
The DNS requires that all nameservers be replicated at
least once -- that is, for each zone of authority there must be at
least two authoritative nameservers. The rules for
replication of nameservers make for quite entertaining reading...
DNS queries and responses are an excellent example of an application
where the reliable, connection-oriented transport mechanism of TCP is
not necessary, and simply has too much overhead. In fact, queries are
normally encapsulated in unreliable UDP datagrams (see later).
UDP is a connectionless transport service, with the
same level of reliability as IP packet delivery itself -- in other
words, UDP messages can be lost, delivered out of order and even
duplicated. If a resolver does not receive a reply from a nameserver,
it usually either tries again, or tries the next nameserver for the
same domain.
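The "try again, or try the next nameserver" behaviour might look something like the sketch below. The function name, the default timeout and the attempt count are assumptions for illustration. A real resolver would also check that each reply matches the query it sent.

```python
import socket


def query_with_retry(servers, payload, timeout=2.0, attempts=2):
    """Send payload to each (host, port) in turn; return the first reply or None."""
    for server in servers:
        for _ in range(attempts):
            with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
                sock.settimeout(timeout)
                try:
                    sock.sendto(payload, server)
                    reply, _sender = sock.recvfrom(4096)
                except OSError:
                    continue  # lost, timed out or refused: try again
                return reply
    return None  # every server failed on every attempt
```

Because UDP gives no delivery guarantee, the timeout is the only way the sender learns that something went wrong; the loop structure above is what replaces TCP's built-in retransmission.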
Finally, although it is beyond the scope of our subject, DNS messages
are NOT simple ASCII strings -- the DNS formats are
quite complex and designed for efficient parsing. It's not trivial
(for obvious reasons) to write a DNS client. In a sense, DNS is not
strictly an application protocol -- it provides support for application
protocols, but isn't one itself.
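To give a feel for why DNS messages are not simple ASCII strings, here is a minimal sketch that builds the bytes of a single-question query. The layout (a 12-byte header followed by length-prefixed name components) comes from the DNS specification rather than from this lecture. The function names are illustrative, and parsing a reply is left out.

```python
import struct

TYPE_A = 1    # query type: host address
CLASS_IN = 1  # query class: Internet


def encode_name(name):
    """Encode a domain name as length-prefixed components ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"bad label: {label!r}")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, query_id, qtype=TYPE_A, qclass=CLASS_IN):
    """Build the bytes of a single-question DNS query message."""
    flags = 0x0100  # standard query with the "recursion desired" bit set
    # Header: ID, flags, then counts of questions, answers,
    # authority records and additional records (all 16-bit, big-endian).
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, qclass)
    return header + question


message = build_query("amazon.com", query_id=0x1234)
print(len(message), message.hex())
```

Even this smallest possible query mixes binary counters, bit flags and length-prefixed text, which is why a full DNS client takes rather more work than a protocol built on lines of ASCII.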
The tutorial for this lecture is Tutorial #10.
Copyright © 2005 by Philip Scott, La Trobe University.