Lecture 7: Internet Applications #3.3: HTTP/1.1
A nice short lecture this time -- we tie up a few last topics in HTTP.
HTTP/1.0 Performance Issues
HTTP/1.0 has been criticised
for poor performance and lack of scalability. There are several aspects
to this:
- HTTP/1.0 opens a new TCP connection for every single transaction.
For example, if a Web page contains 10 <img ...> images, HTTP/1.0
must open a total of 11 TCP connections -- one for the original page,
and 10 for the images. Problems which arise from this include:
- There can be a moderately long overhead in initial connection
establishment due to round-trip delays.
- TCP initiates connections using the so-called "slow start"
algorithm. This is necessary for proper operation, but is very
inefficient for short transfers -- TCP typically takes 10 to
20KB of transferred data to get "up to full speed". Both of
these can cause HTTP/1.0 browsers to seem really slow.
- TCP is required to maintain "state" information about closed
connections for 240 seconds, to ensure that stray packets from old
connections won't be interpreted as valid data by a later
connection. When a server is handling a large number of
connections, this can require huge buffer space, and is very
inefficient.
- HTTP/1.0 has limited support for caching.
Because of these problems, HTTP/1.0 is gradually being replaced by
HTTP/1.1 (RFC 2616).
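As a rough illustration of the connection-per-transaction cost, the sketch below counts connections, connection-setup delay and lingering closed-connection state. The 100 ms round-trip time and the rate of 50 closed connections per second are assumed figures chosen for the example, and the function names are hypothetical; only the 11-connection page and the 240-second figure come from the text above. It also assumes connections are opened one after another and that each costs one round trip to establish.

```python
TIME_WAIT_SECONDS = 240  # state retention period quoted in the lecture


def connections_needed(num_images: int, persistent: bool) -> int:
    """One connection per object for HTTP/1.0, one in total if it is reused."""
    objects = 1 + num_images  # the page itself plus its images
    return 1 if persistent else objects


def setup_delay_ms(connections: int, rtt_ms: int) -> int:
    """Assumption: each connection costs one round trip before any request is sent."""
    return connections * rtt_ms


def lingering_connections(closed_per_second: int) -> int:
    """Closed connections whose state is still being held, in steady state."""
    return closed_per_second * TIME_WAIT_SECONDS


if __name__ == "__main__":
    http10 = connections_needed(10, persistent=False)
    print(http10, setup_delay_ms(http10, rtt_ms=100))  # assumed 100 ms RTT
    print(lingering_connections(50))                   # assumed 50 closes/second
```

Under these assumptions the HTTP/1.0 page load spends over a second just setting up connections, which is the kind of overhead the persistent connections described later are meant to remove.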
HTTP/1.1 Basics
HTTP/1.1 (RFC 2616) is now
in widespread use. It extends the older protocol in a number of areas,
notably persistent connections/pipelining and
support for caching.
To implement persistent connections, HTTP/1.1
introduced a new request (and also response) header called
"Connection:". This can take two values: "close" (which means that
this is not a persistent connection) and "keep-alive", which means
that the TCP connection is held until either side sends a
"Connection: close" header, indicating that it wishes to terminate.
Note that RFC 2616 makes persistence the default for HTTP/1.1, so an
explicit "keep-alive" is mainly useful when talking to older HTTP/1.0
software.
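The decision of whether to keep a connection open can be sketched as a small function. This is a minimal sketch, not a complete implementation: the function name `is_persistent` and the plain-dict representation of headers are hypothetical, and it assumes the RFC 2616 rule that HTTP/1.1 connections persist unless "close" is sent, while HTTP/1.0 connections persist only if "keep-alive" is requested.

```python
def is_persistent(version: str, headers: dict) -> bool:
    """Decide whether the TCP connection should stay open after this message."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    tokens = {
        token.strip().lower()
        for token in lowered.get("connection", "").split(",")
        if token.strip()
    }
    if "close" in tokens:
        return False              # either side may ask to terminate
    if version == "HTTP/1.1":
        return True               # assumed default for HTTP/1.1
    return "keep-alive" in tokens # older clients must ask explicitly


if __name__ == "__main__":
    print(is_persistent("HTTP/1.1", {}))
    print(is_persistent("HTTP/1.1", {"Connection": "close"}))
    print(is_persistent("HTTP/1.0", {"Connection": "Keep-Alive"}))
```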
The browser can utilise a persistent connection by sending multiple
requests over the connection without stopping and waiting for each of
them to be satisfied before sending the next -- the responses are "in
the pipeline". Similarly, the server can send its responses one after
another, in the same order as the requests. This is possible because
each request can be unambiguously delimited, as can the responses,
using the "Content-length:" headers. The huge wins here,
obviously, are that there's no delay opening multiple TCP connections,
and the slow-start algorithm has time to get up to full speed.
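To show why the length header matters, here is a minimal sketch of how a client might separate several pipelined responses arriving back-to-back on one connection. The function name `split_responses` is hypothetical, and the sketch assumes every response carries a Content-Length header and arrives complete; real responses may be delimited in other ways that are not handled here.

```python
from typing import List, Tuple


def split_responses(stream: bytes) -> List[Tuple[str, bytes]]:
    """Split concatenated responses into (status line, body) pairs."""
    responses = []
    pos = 0
    while pos < len(stream):
        # Headers end at the first blank line; raises ValueError if incomplete.
        head_end = stream.index(b"\r\n\r\n", pos)
        lines = stream[pos:head_end].decode("iso-8859-1").split("\r\n")
        length = 0
        for line in lines[1:]:
            name, _, value = line.partition(":")
            if name.strip().lower() == "content-length":
                length = int(value.strip())
        body_start = head_end + 4
        responses.append((lines[0], stream[body_start:body_start + length]))
        pos = body_start + length  # the next response starts right here
    return responses


if __name__ == "__main__":
    data = (b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
            b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nabc")
    for status, body in split_responses(data):
        print(status, body)
```

Without the length, the client would have no way of telling where "hello" ends and the next status line begins, other than waiting for the connection to close.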
Web Caching
The World Wide Web has been spectacularly successful -- so successful
that a huge proportion of Internet traffic is HTTP, i.e. Web pages and
related objects such as images. Caching is a technique
whereby copies of popular objects are kept in strategic locations, and
supplied in lieu of the originals, saving huge amounts of
traffic on the "backbone networks".
The Conditional-GET operation seen earlier allows
support for caching at the browser level -- that is,
the browser can keep a local copy of an object and check if it's up to
date before displaying it. Two additional features of HTTP/1.0 were:
- The Expires: response header was used to indicate that an entity
had a limited (specified) "lifetime". This permits finer control over
the Conditional-GET operation. It takes an Internet-standard
date/time string as its value.
- The Pragma: no-cache response header has an obvious meaning: this
entity should never be stored in a cache.
- Note: the (non-standard) Refresh: response header can be used (in
some browsers) to force a reload of an entity.
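A browser-level freshness check based on the Expires: header might look like the sketch below. The function name `is_fresh` and the sample date are made up for illustration; the date parsing uses `email.utils.parsedate_to_datetime` from the Python standard library, and a value without a time zone is assumed to be GMT.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh(expires_value: str, now: datetime) -> bool:
    """True while the cached copy has not yet reached its Expires: time."""
    expires = parsedate_to_datetime(expires_value)
    if expires.tzinfo is None:
        # Assumption: treat a zone-less date as GMT.
        expires = expires.replace(tzinfo=timezone.utc)
    return now < expires


if __name__ == "__main__":
    header = "Thu, 01 Dec 1994 16:00:00 GMT"  # illustrative value only
    print(is_fresh(header, datetime(1994, 12, 1, 15, 0, tzinfo=timezone.utc)))
    print(is_fresh(header, datetime(1994, 12, 1, 17, 0, tzinfo=timezone.utc)))
```

While the copy is fresh the browser can display it without contacting the server at all; once it has expired, a Conditional-GET is the natural next step.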
Additionally, HTML "<META HTTP-EQUIV=..." tags can include
"equivalent" response information in the <HEAD> section of an HTML
document. The browser may regard this as being equivalent to the
corresponding HTTP response header.
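One way a client could pick up these "equivalent" headers is sketched below, using the standard library `html.parser` module. The class and function names are hypothetical and the sample document is invented; the sketch simply collects every HTTP-EQUIV/CONTENT pair it sees and makes no attempt to restrict itself to the <HEAD> section.

```python
from html.parser import HTMLParser


class HttpEquivCollector(HTMLParser):
    """Collects HTTP-EQUIV/CONTENT pairs from META tags."""

    def __init__(self):
        super().__init__()
        self.headers = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("http-equiv")
        content = attributes.get("content")
        if name and content is not None:
            self.headers[name] = content


def http_equiv_headers(html: str) -> dict:
    collector = HttpEquivCollector()
    collector.feed(html)
    return collector.headers


if __name__ == "__main__":
    page = ('<HTML><HEAD><META HTTP-EQUIV="Expires" '
            'CONTENT="Thu, 01 Dec 1994 16:00:00 GMT"></HEAD></HTML>')
    print(http_equiv_headers(page))
```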
Proxy Caches
A proxy server is an HTTP server which fetches Web objects (pages,
images, etc) on behalf of its clients. Proxies normally cache all
"cacheable" responses, so that if an entity is stored locally, it is
returned instead of sending a request to the originating server. Such
shared caches can significantly reduce an
organisation's "download volume", as well as give significant
performance improvements to the end-user.
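The basic behaviour of a caching proxy can be sketched in a few lines. This is a deliberately simplified model: the `ProxyCache` class and the `fetch_from_origin` callback are hypothetical names, every response is treated as cacheable, and nothing ever expires.

```python
from typing import Callable, Dict


class ProxyCache:
    """Returns a stored copy when it has one, otherwise asks the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]):
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.origin_requests = 0  # counts the "download volume" in requests

    def get(self, url: str) -> bytes:
        if url in self._store:
            return self._store[url]       # served locally, no outside traffic
        self.origin_requests += 1
        body = self._fetch(url)
        self._store[url] = body           # keep a copy for the next client
        return body


if __name__ == "__main__":
    proxy = ProxyCache(lambda url: b"contents of " + url.encode())
    for _ in range(3):
        proxy.get("http://www.bendigo.latrobe.edu.au/index.html")
    print(proxy.origin_requests)
```

Three client requests cost only one request to the originating server, which is where both the volume saving and the speed-up come from.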
Requests to a proxy server are always specified as full URLs, so the
first line of a typical GET request now looks like:
GET http://www.bendigo.latrobe.edu.au/index.html HTTP/1.0
....other request headers...<newline><newline>
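The difference between the two forms of request line can be sketched as follows. The function name `request_line` is hypothetical; the URL splitting uses `urllib.parse.urlsplit` from the standard library, and the direct-to-server form is assumed to carry only the path (and any query string).

```python
from urllib.parse import urlsplit


def request_line(url: str, via_proxy: bool, version: str = "HTTP/1.0") -> str:
    """Build the first line of a GET request."""
    if via_proxy:
        target = url                      # proxies are given the full URL
    else:
        parts = urlsplit(url)
        target = parts.path or "/"        # origin servers get just the path
        if parts.query:
            target += "?" + parts.query
    return f"GET {target} {version}"


if __name__ == "__main__":
    url = "http://www.bendigo.latrobe.edu.au/index.html"
    print(request_line(url, via_proxy=True))
    print(request_line(url, via_proxy=False))
```

The proxy needs the full URL because, unlike the originating server, it cannot otherwise know which host the request is meant for.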
Whilst proxy servers (and caches) were described in HTTP/1.0, the rules
as to how caching should be controlled were unspecified.
Cache Control Mechanisms in HTTP/1.1
HTTP/1.1 introduced a new Cache-Control: header
which significantly improved the operation of both private (browser)
and shared (proxy) caches. This response header is complex: it has
many, many possible combinations of values. Some common examples include:
Cache-Control: public
- This entity is always cacheable, even in circumstances where it
may not be obvious (e.g. in response to a request with an
Authorization: header).
Cache-Control: private
- The response is not to be cached in proxy
caches, and is intended for the use of the end-user alone. The
response may be cached at the end-user browser.
Cache-Control: no-cache
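- Less obvious than it looks. A cache may keep this response, but
must revalidate it with the originating server before every reuse.
The no-store directive is more restrictive: the response must not be
stored anywhere.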
Cache-Control: max-age=3600
- Specifies a time, in seconds, after which the entity becomes
"stale". The s-maxage variant specifically refers to proxy (shared)
caches. Both of these are commonly combined with re-validation
options, to give (for example):
Cache-Control: max-age=3600, must-revalidate
- After 3600 seconds, the freshness of the entity must be checked
at the originating server.
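A cache has to turn the header value into decisions. The sketch below parses the comma-separated directives and answers two questions: may this cache store the response, and for how many seconds is it fresh? All function names are hypothetical. Only the private and no-store directives are considered when deciding whether to store, and the simple comma split does not cope with directives whose quoted arguments themselves contain commas.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Map each directive name to its argument (None if it has none)."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def may_store(directives: Dict[str, Optional[str]], shared_cache: bool) -> bool:
    if "no-store" in directives:
        return False                       # nobody may keep a copy
    if shared_cache and "private" in directives:
        return False                       # browsers may, proxies may not
    return True


def freshness_lifetime(directives: Dict[str, Optional[str]],
                       shared_cache: bool) -> Optional[int]:
    """Seconds until the entity becomes stale, or None if unspecified."""
    if shared_cache and "s-maxage" in directives:
        return int(directives["s-maxage"])  # shared caches prefer s-maxage
    if "max-age" in directives:
        return int(directives["max-age"])
    return None


if __name__ == "__main__":
    d = parse_cache_control("max-age=3600, must-revalidate")
    print(d, may_store(d, shared_cache=True), freshness_lifetime(d, shared_cache=True))
```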
Entity Tags in HTTP/1.1
The "Entity Tag" is new in HTTP/1.1. It is an opaque identifier which
the server assigns to a particular version of an entity, and it can
reveal that two (perhaps apparently unrelated) resources are in fact
the same. For example, requests for each of the two Web pages:
http://ironbark.bendigo.latrobe.edu.au/subjects/int21cn/news.html
http://ironbark.bendigo.latrobe.edu.au/subjects/int31bcn/news.html
both return the same Entity Tag header:
ETag: "1cc30e3-88e-404e6d9b"
The client can use an
If-None-Match: "1cc30e3-88e-404e6d9b"
request header with a GET request to specify the version of the object
which it already has. If the tag still matches, the server can answer
"304 Not Modified" without resending the body. This is a significant
improvement over the HTTP/1.0 "Conditional-GET" -- although not all
entities are (by default) generated with Entity Tags.
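The server side of this exchange can be sketched as a comparison between the tag the client sent and the tag of the current version. The function name `conditional_get` is hypothetical, the body is a placeholder, and the sketch assumes a simple exact-match comparison: a match yields a 304 status with no body, anything else yields the full entity.

```python
from typing import Optional, Tuple


def conditional_get(if_none_match: Optional[str],
                    current_etag: str,
                    body: bytes) -> Tuple[int, bytes]:
    """Return (status code, body) for a GET that may carry If-None-Match."""
    if if_none_match is not None:
        # The header may list several tags, or "*" meaning "any version".
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or current_etag in candidates:
            return 304, b""        # client's copy is current: send no body
    return 200, body               # no tag, or a stale one: send everything


if __name__ == "__main__":
    etag = '"1cc30e3-88e-404e6d9b"'
    print(conditional_get(etag, etag, b"<html>news</html>"))
    print(conditional_get('"older-tag"', etag, b"<html>news</html>"))
```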
You can discover lots more about HTTP/1.1 at:
http://www.w3.org/pub/WWW/Protocols/Specs.html
The tutorial for this lecture is Tutorial #7.
Copyright © 2004 by Philip Scott, La Trobe University.