Lecture 7: Applications #3.2: HTTP
HyperText Transfer Protocol, v1.0
The original (0.9) version of HTTP was not in use for very long, being
soon replaced by version 1.0. In its most basic form, a v1.0 GET
request looks like:
GET /index.html HTTP/1.0<newline><newline>
The response from the server consists of a status line, then a number
of plain text headers, followed by a blank line and then the requested
data object. It's clearly a very similar format to an RFC822 email
message:
GET /index.html HTTP/1.0

HTTP/1.0 200 OK
Server: Netscape-Enterprise/3.5.1C
Date: Tue, 20 Mar 2001 11:48:39 GMT
Content-type: text/html
Last-modified: Fri, 16 Mar 2001 02:22:52 GMT
Content-length: 11378

<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
........(etc)
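The layout just described (status line, headers, blank line, data) can be pulled apart with a few lines of code. The following is a minimal sketch only: the function name parse_response is made up for illustration, and the sample bytes are a shortened version of the response above. It assumes the whole response is already in memory.

```python
import re

def parse_response(raw: bytes):
    """Split a raw HTTP response into (status, headers, body)."""
    # The headers end at the first blank line; accept CRLF or bare LF.
    parts = re.split(rb"\r?\n\r?\n", raw, maxsplit=1)
    head = parts[0]
    body = parts[1] if len(parts) > 1 else b""
    lines = head.decode("iso-8859-1").splitlines()
    # Status line: protocol version, numeric code, human-readable reason.
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return (version, int(code), reason), headers, body

sample = (b"HTTP/1.0 200 OK\r\n"
          b"Server: Netscape-Enterprise/3.5.1C\r\n"
          b"Content-type: text/html\r\n"
          b"Content-length: 13\r\n"
          b"\r\n"
          b"<html></html>")
status, headers, body = parse_response(sample)
print(status, headers["content-type"], len(body))
```

Note that the body is kept as raw bytes: only the status line and headers are guaranteed to be text.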
A Tour of the HTTP/1.0 Response Headers
HTTP/1.0 200 OK
- An ordinary plain text status line -- note the "200-series"
status.
Server: Netscape-Enterprise/3.5.1C
Date: Tue, 20 Mar 2001 11:48:39 GMT
Last-modified: Fri, 16 Mar 2001 02:22:52 GMT
- Various entertaining bits of information. The "Last-modified:"
header can be quite useful; see the HTTP/1.0 "HEAD" request, later.
Content-length: 11378
Content-type: text/html
- These two headers follow (approximately) the MIME convention
for describing the data contained in the "body" of the response: its
length in bytes and its type -- in this case, plain text which should
be interpreted as HTML by the browser. Note that the MIME email header
"Content-Transfer-Encoding:" (used in MIME-encoded email messages) is
not used in HTTP because the protocol is designed to handle "8-bit"
data. That is, any data at all can be sent after the blank line which
signifies the end of the response headers.
More on the GET Request
HTTP/1.0 permits the GET request (and other HTTP request types, see
later) to additionally send a series of optional Request Headers along
with the request. For example, here's a typical request to ironbark,
snarfed from the local network:
GET /index.html HTTP/1.0
User-Agent: Mozilla/3.0 (X11; IRIX 5.3 IP12)
Host: ironbark.bendigo.latrobe.edu.au
Accept: image/gif, image/jpeg, */*
Referer: http://ironbark.bendigo.latrobe.edu.au/index.html
The request headers are terminated with a blank line -- hence the need
for two newlines, as seen in the first slide of today's lecture. It's also possible
for the request to contain a "message body", just like a response
message -- we defer discussion of this until later in the unit.
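To make the "blank line" rule concrete, here is a small sketch of how a client might assemble such a request. The helper name build_request is hypothetical. Note also that the HTTP specification terminates each line with a carriage return and line feed pair rather than a bare newline, which is what the code below sends.

```python
def build_request(path, headers=None, version="HTTP/1.0"):
    """Return the bytes of a GET request with optional request headers."""
    lines = [f"GET {path} {version}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # The final CRLF CRLF ends the last line and adds the blank line
    # which tells the server that the request headers are finished.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

request = build_request("/index.html", {
    "User-Agent": "Mozilla/3.0 (X11; IRIX 5.3 IP12)",
    "Host": "ironbark.bendigo.latrobe.edu.au",
    "Accept": "image/gif, image/jpeg, */*",
})
print(request.decode("ascii"))
```

With no headers at all, the same function produces the basic two-newline request shown at the start of the lecture.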
Perhaps the most interesting optional request header is
"If-modified-since:", which takes an HTTP standard GMT time/date
string as its value. If the requested page has not, in fact, been
modified since the specified time, it won't be returned -- instead, a
"304 Not Modified" response is sent. This is called a Conditional-GET
and is very useful in support of caching, of which more later.
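The server's side of a Conditional-GET boils down to one date comparison. The sketch below shows the idea; the function name conditional_get_status is invented for illustration, and a real server would have more cases to consider. It uses the date parser from Python's standard email package, since HTTP dates share the RFC822 format.

```python
from email.utils import parsedate_to_datetime
from typing import Optional

def conditional_get_status(last_modified: str,
                           if_modified_since: Optional[str]) -> int:
    """Return 304 if the client's copy is still current, otherwise 200."""
    if if_modified_since is None:
        return 200                      # an ordinary, unconditional GET
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200                      # unreadable date: ignore the header
    modified = parsedate_to_datetime(last_modified)
    # Not modified since the client's date means there is no need to resend.
    return 304 if modified <= since else 200

page_date = "Fri, 16 Mar 2001 02:22:52 GMT"
print(conditional_get_status(page_date, "Tue, 20 Mar 2001 11:48:39 GMT"))
print(conditional_get_status(page_date, "Thu, 15 Mar 2001 00:00:00 GMT"))
```

The first call corresponds to a client whose copy is newer than the page's last change, the second to a client holding a stale copy.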
Other HTTP/1.0 Request Types
The HTTP 1.0 protocol is formally specified in terms of
"methods," rather than simple commands. The available
methods are:
- GET
- We've already seen this "request to read a generalised object".
The object can be a Web "page" (HTML document), an image, a sound
sample or a wide range of other types.
- HEAD
- A request to return the response header only, without the
content. This can contain much useful information about the
requested entity, without the need to actually load it -- eg, how
big it is.
- POST
- Originally defined as a request to "append to a named resource"
(eg, a Web page), this method is extensively used in
CGI-based systems, see later.
- PUT
- Request to store a Web page. Has only ever been used
experimentally.
- DELETE
- Delete the Web page specified. I'm unaware of this ever having
been used, so we can ignore it.
- LINK
- Connect two existing resources. Likewise, never used.
- UNLINK
- Breaks an existing connection between two resources. Not used.
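The difference between GET and HEAD is easy to see in code. This is a toy sketch, not a real server: the function name respond is hypothetical, and every method other than GET and HEAD is simply refused with the standard "501 Not Implemented" status.

```python
def respond(method: str, body: bytes, content_type: str = "text/html") -> bytes:
    """Build a complete HTTP/1.0 response for a single known object."""
    head = (f"HTTP/1.0 200 OK\r\n"
            f"Content-type: {content_type}\r\n"
            f"Content-length: {len(body)}\r\n"
            f"\r\n").encode("ascii")
    if method == "GET":
        return head + body      # headers, blank line, then the object
    if method == "HEAD":
        return head             # identical headers, but no content
    return b"HTTP/1.0 501 Not Implemented\r\n\r\n"

page = b"<html></html>"
print(len(respond("GET", page)) - len(respond("HEAD", page)))
```

The HEAD response still carries the "Content-length:" of the object, which is how a client can learn how big it is without loading it.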
HTTP/1.0 Performance Issues
HTTP/1.0 has been criticised
for poor performance and lack of scalability. There are several aspects
to this:
- HTTP/1.0 opens a new TCP connection for every single transaction.
For example, if a Web page contains 10 <img ...> images, HTTP/1.0
must open a total of 11 TCP connections -- one for the original page,
and 10 for the images. Problems which arise from this include:
- There can be a moderately long overhead in initial connection
establishment due to round-trip delays.
- TCP initiates connections using the so-called "slow start"
algorithm. This is necessary for proper operation, but is very
inefficient for short transfers -- TCP typically takes 10 to
20KB of transferred data to get "up to full speed". Both of
these can cause HTTP/1.0 browsers to seem really slow.
- TCP is required to maintain "state" information about closed
connections for 240 seconds, to ensure that stray packets from old
connections won't be interpreted as valid data by a later
connection. When a server is handling a large number of
connections, this can require huge buffer space, and is very
inefficient.
- HTTP/1.0 has very limited support for caching.
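A rough back-of-envelope calculation shows the scale of the connection overheads listed above. Everything in this sketch is an assumed, simplified model: objects are fetched one after another, each new connection costs one round trip to establish plus one for the request and response, transfer time and slow start are ignored, and the 100 ms round-trip time and 100 connections per second are made-up figures.

```python
def http10_latency_ms(objects: int, rtt_ms: int) -> int:
    """One connection per object: handshake + request/response each time."""
    return objects * 2 * rtt_ms

def persistent_latency_ms(objects: int, rtt_ms: int) -> int:
    """One handshake, then one request/response round trip per object."""
    return rtt_ms + objects * rtt_ms

def closed_connection_records(connections_per_second: int,
                              hold_seconds: int = 240) -> int:
    """Closed connections the server is still remembering at any moment."""
    return connections_per_second * hold_seconds

print(http10_latency_ms(11, 100))        # 2200
print(persistent_latency_ms(11, 100))    # 1200
print(closed_connection_records(100))    # 24000
```

Under these assumptions the page with 10 images spends about a second longer waiting on round trips when every object needs its own connection, and a server closing 100 connections per second is tracking 24000 dead connections at any given time.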
Because of these aspects, HTTP 1.0 is gradually being replaced by
HTTP 1.1.
HTTP/1.1 Basics
HTTP/1.1 (RFC 2616) is now
in widespread use. It extends the older protocol in a number of areas,
notably persistent connections/pipelining and
support for caching.
To implement Persistent Connections, HTTP/1.1
introduced a new request (and also response) header called
"Connection:". This can take two values: "close" (which means that
this is not a persistent connection) and "keep-alive", which means
that the TCP connection is held until either side sends a
"Connection: close" header, indicating that it wishes to terminate.
An HTTP/1.1 connection is persistent by default, so "keep-alive"
matters mainly when talking to older HTTP/1.0 software.
The browser can utilise a persistent connection by sending multiple
requests over the connection without stopping and waiting for each of
them to be satisfied before sending the next -- the responses are "in
the pipeline". Similarly, the server can send its responses one after
another. This is possible because the end of each request, and of each
response, can be located unambiguously using the "Content-length:"
headers. The huge wins here, obviously, are that there's no delay
opening multiple TCP connections, and the slow-start algorithm has time
to get up to full speed.
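To see why "Content-length:" is enough to separate pipelined responses, consider a client which has received several responses back-to-back on one connection. The sketch below (with the invented name split_responses) cuts the byte stream apart again. It assumes every response carries a "Content-length:" header; HTTP/1.1 has other ways of marking the end of a body, which are not handled here.

```python
def split_responses(stream: bytes):
    """Split concatenated responses into (status line, body) pairs."""
    responses = []
    while stream:
        head, sep, rest = stream.partition(b"\r\n\r\n")
        if not sep:
            break                       # headers not complete yet
        head_lines = head.split(b"\r\n")
        length = 0
        for line in head_lines[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        if len(rest) < length:
            break                       # body not complete yet
        responses.append((head_lines[0].decode("ascii"), rest[:length]))
        # Whatever follows the body is the start of the next response.
        stream = rest[length:]
    return responses

pipelined = (b"HTTP/1.1 200 OK\r\nContent-length: 5\r\n\r\nhello"
             b"HTTP/1.1 200 OK\r\nContent-length: 3\r\n\r\nGIF")
print(split_responses(pipelined))
```

Because the body is taken by counting bytes, it doesn't matter what the body contains, even something which looks like a blank line.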
Support for Caching in HTTP/1.1
The World Wide Web has been spectacularly successful -- so successful
that a huge proportion of Internet traffic is HTTP, ie Web pages and
related objects (eg, images). Caching is a technique
whereby copies of popular objects are kept in strategic locations, and
supplied in lieu of the originals, saving huge amounts of
traffic on the "backbone networks".
The Conditional-GET operation seen earlier allows
support for caching at the browser level -- that is,
the browser can keep a local copy of an object and check if it's up to
date before displaying it.
HTTP/1.1 adds support for proxy server caches. A proxy
server is an HTTP server which fetches Web objects (pages, images, etc)
on behalf of its clients. Requests sent to a proxy specify the object
as a full URL, so the first line of a GET request now looks like:
GET http://www.bendigo.latrobe.edu.au/index.html HTTP/1.1
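As a sketch of the two request forms: in RFC 2616 the full URL is used when the request goes to a proxy, while a request sent directly to the originating server gives only the path, with the server's name carried in the "Host:" header. The helper name request_lines below is hypothetical.

```python
from urllib.parse import urlsplit

def request_lines(url: str, via_proxy: bool):
    """Return the request line and Host header for fetching url."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # A proxy needs the whole URL to know which server to contact;
    # the originating server itself only needs the path.
    target = url if via_proxy else path
    return [f"GET {target} HTTP/1.1", f"Host: {parts.netloc}"]

url = "http://www.bendigo.latrobe.edu.au/index.html"
print(request_lines(url, via_proxy=True))
print(request_lines(url, via_proxy=False))
```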
Implementing Caching
It's obvious that the proxy server can check its local cache to see if
a requested object has recently been fetched. It is slightly more
subtle to discover if it's actually the same object.
HTTP/1.1 uses the following response headers, not all of them new, to
ensure that caching works correctly:
Expires: Tue, 20 Mar 2001 02:22:52 GMT
Pragma: no-cache
- An object can be marked as having a limited lifetime, and once
the specified date/time has passed must be re-fetched from the
originating server. Also, an object can be flagged as
"un-cachable". These were both present in HTTP/1.0.
Etag: "8802-2c72-3ab178fc"
- This is the "Entity Tag", and is used to discover, with
somewhat greater certainty, if the object (or entity) in the local
cache is exactly the same object (ie, isn't
different in any way) as the object stored on the remote server.
The client can use an
If-None-Match: "8802-2c72-3ab178fc"
header with a GET request to specify the version of the object
which it already has. This is a significant improvement over the
HTTP/1.0 "Conditional-GET". Note that HTTP/1.1 has a large number
of other operations which can be used with Entity Tags.
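The server's handling of "If-None-Match:" can be sketched in much the same way as the date-based Conditional-GET, except that opaque tags are compared rather than dates. This is a simplified illustration with an invented function name, etag_status; it ignores the distinction HTTP/1.1 draws between "strong" and "weak" tags.

```python
from typing import Optional

def etag_status(current_etag: str, if_none_match: Optional[str]) -> int:
    """Return 304 if the client already holds the current entity, else 200."""
    if if_none_match is None:
        return 200
    # The header may list several tags, separated by commas.
    candidates = [tag.strip() for tag in if_none_match.split(",")]
    # "*" matches any current entity.
    if "*" in candidates or current_etag in candidates:
        return 304
    return 200

current = '"8802-2c72-3ab178fc"'
print(etag_status(current, '"8802-2c72-3ab178fc"'))
print(etag_status(current, '"0000-0000-00000000"'))
```

Unlike a date comparison, this check detects any change to the object, even two changes made within the same second.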
You can discover lots more about HTTP at:
http://www.w3.org/pub/WWW/Protocols/Specs.html
The tutorial for this lecture is Tutorial #07.
Copyright © 2001 by Philip Scott, La Trobe University.