Lecture 7: Internet Applications #3.3: HTTP/1.1
A nice short lecture this time -- we tie up a few last topics in HTTP.
HTTP/1.0 Performance Issues
HTTP/1.0 has been criticised
for poor performance and lack of scalability. There are several aspects
to this:
- HTTP/1.0 opens a new TCP connection for every single transaction.
For example, if a Web page contains 10 <img ...> images, HTTP/1.0
must open a total of 11 TCP connections -- one for the original
page, and 10 for the images. Problems which arise from this include:
- There can be a moderately long overhead in initial connection
establishment due to round-trip delays.
- TCP initiates connections using the so-called "slow start"
algorithm. This is necessary for proper operation, but is very
inefficient for short transfers -- TCP typically takes 10 to
20KB of transferred data to get "up to full speed". Both of
these can cause HTTP/1.0 browsers to seem really slow.
- TCP is required to maintain "state" information about closed
connections for 240 seconds, to ensure that stray packets from old
connections won't be interpreted as valid data by a later
connection. When a server is handling a large number of
short-lived connections, these leftover state records can tie up
a large amount of memory.
- HTTP/1.0 has very limited support for caching.
Because of these aspects, HTTP/1.0 is gradually being replaced by
HTTP/1.1 (rfc2616).
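To get a feel for the numbers, here is a rough back-of-the-envelope sketch of two of the costs listed above. It assumes one round trip per TCP handshake, connections opened strictly one after another, and no slow-start effects; the function names and the example figures (100 ms round trip, 100 connections per second) are illustrative assumptions, not measurements.

```python
TIME_WAIT_SECONDS = 240  # retention period for closed connections quoted above


def handshake_delay_ms(num_objects, rtt_ms, persistent):
    """Time spent purely on TCP connection setup, in milliseconds.

    Assumes one round trip per handshake and sequential connections.
    """
    connections = 1 if persistent else num_objects
    return connections * rtt_ms


def closed_connection_records(connections_per_second):
    """Closed connections a busy server must still remember at any moment."""
    return connections_per_second * TIME_WAIT_SECONDS


# One page plus 10 images over a path with a 100 ms round trip.
print(handshake_delay_ms(11, 100, persistent=False))  # one connection per object
print(handshake_delay_ms(11, 100, persistent=True))   # a single reused connection
print(closed_connection_records(100))
```

Real browsers soften the first figure by opening several connections in parallel, so treat it as an upper bound on the setup delay rather than a prediction.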
HTTP/1.1 Basics
HTTP/1.1 (rfc2616) is now
in widespread use. It extends the older protocol in a number of areas,
notably persistent connections/pipelining and
support for caching.
To implement persistent connections, HTTP/1.1
introduced a new request (and also response) header called
"Connection:". This can take two values: "close" (which means that
this is not a persistent connection) and "keep-alive", which means
that the TCP connection is held until either side sends a
"Connection: close" header, indicating that it wishes to terminate.
In HTTP/1.1 a connection is persistent by default, so "keep-alive"
does not need to be sent explicitly.
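The decision each side has to make can be sketched as a small function. This is only an illustration: the function name and the use of a plain dict for headers are assumptions. It relies on the rule in rfc2616 that HTTP/1.1 connections are persistent unless "Connection: close" is seen, whereas an HTTP/1.0 peer has to ask for "keep-alive" explicitly.

```python
def is_persistent(http_version, headers):
    """Return True if the TCP connection should stay open after this message.

    headers is a dict of header name -> value (a simplified representation).
    """
    tokens = set()
    for name, value in headers.items():
        if name.lower() == "connection":
            # The header may carry several comma-separated tokens.
            tokens.update(t.strip().lower() for t in value.split(","))
    if "close" in tokens:
        return False
    if http_version == "HTTP/1.1":
        return True  # persistent by default in HTTP/1.1
    return "keep-alive" in tokens  # HTTP/1.0 peers must opt in


print(is_persistent("HTTP/1.1", {}))
print(is_persistent("HTTP/1.1", {"Connection": "close"}))
print(is_persistent("HTTP/1.0", {"Connection": "keep-alive"}))
```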
The browser can utilise a persistent connection by sending multiple
requests over the connection without stopping and waiting for each of
them to be satisfied before sending the next -- the responses are "in
the pipeline". Similarly, the server can send its responses one after
another, in the same order as the requests. This is possible because
each request can be unambiguously delimited, as can the responses,
using the "Content-Length:" headers. The huge wins here,
obviously, are that there's no delay opening multiple TCP connections,
and the slow-start algorithm has time to get up to full speed.
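To see why the length header is enough to separate back-to-back messages, here is a minimal sketch of a client pulling pipelined responses out of one byte stream. It assumes every response carries a Content-Length header and that the whole stream has already arrived; chunked transfer encoding and bodiless responses are deliberately ignored, and the function name is made up for the example.

```python
def split_responses(stream):
    """Split a byte string holding back-to-back HTTP responses.

    Returns a list of (status_line, body) pairs, in arrival order.
    """
    responses = []
    pos = 0
    while pos < len(stream):
        # The header section ends at the first blank line.
        head_end = stream.index(b"\r\n\r\n", pos)
        head = stream[pos:head_end].decode("iso-8859-1")
        status_line, *header_lines = head.split("\r\n")
        headers = {}
        for line in header_lines:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        # Content-Length says exactly where this body stops
        # and therefore where the next response starts.
        body_start = head_end + 4
        body_end = body_start + int(headers["content-length"])
        responses.append((status_line, stream[body_start:body_end]))
        pos = body_end
    return responses


pipelined = (
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
    b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nGIF"
)
for status, body in split_responses(pipelined):
    print(status, body)
```

Because responses come back in request order, the client can match the first pair to its first request, the second to its second, and so on.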
Support for Caching in HTTP/1.1
The World Wide Web has been spectacularly successful -- so successful
that a huge proportion of Internet traffic is HTTP, ie Web pages and
related objects (eg, images). Caching is a technique
whereby copies of popular objects are kept in strategic locations, and
supplied in lieu of the originals, saving huge amounts of
traffic on the "backbone networks".
The Conditional-GET operation seen earlier allows
support for caching at the browser level -- that is,
the browser can keep a local copy of an object and check if it's up to
date before displaying it.
HTTP/1.1 adds support for proxy server caches. A proxy
server is an HTTP server which fetches Web objects (pages, images, etc)
on behalf of its clients. A request sent to a proxy must specify the
object as a full URL, so the first line of a GET
request now looks like:
GET http://www.bendigo.latrobe.edu.au/index.html HTTP/1.1
....other request headers...
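The difference between talking to a proxy and talking directly to the origin server can be sketched as follows. The function name and the via_proxy flag are assumptions made for the illustration; the Host header is included because HTTP/1.1 requires it on every request.

```python
from urllib.parse import urlsplit


def build_get_request(url, via_proxy):
    """Build a minimal HTTP/1.1 GET request as a string."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # A proxy needs the full URL to know which server to contact;
    # an origin server only needs the path on that server.
    target = url if via_proxy else path
    return (
        f"GET {target} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n"
        "\r\n"
    )


url = "http://www.bendigo.latrobe.edu.au/index.html"
print(build_get_request(url, via_proxy=True))
print(build_get_request(url, via_proxy=False))
```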
Implementing Caching
It's obvious that the proxy server can check its local cache to see if
a requested object has recently been fetched. It is slightly more
subtle to discover if it's actually the same object.
HTTP/1.1 adds some new response headers to ensure that caching works
correctly:
Expires: Wed, 26 Mar 2003 02:22:52 GMT
Pragma: no-cache
- An object can be marked as having a limited lifetime; once
the specified date/time has passed it must be re-fetched from the
originating server. Also, an object can be flagged as
"uncacheable". Both of these headers were already present in HTTP/1.0.
ETag: "8802-2c72-3ab178fc"
- This is the "Entity Tag", and is used to discover, with
somewhat greater certainty, if the object (or entity) in the
local cache is exactly the same object (ie, isn't different in
any way) as the object stored on the remote server. The client
can use an
If-None-Match: "8802-2c72-3ab178fc"
header with a GET request to specify the version of the object
which it already has. This is a significant improvement over
the HTTP/1.0 "Conditional-GET". Note that HTTP/1.1 has a large
number of other operations which can be used with Entity Tags.
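Putting the headers above together, a cache's decision might be sketched like this. It is a simplified illustration, not how any particular proxy works: headers are plain dicts, all three function names are hypothetical, an object without an Expires header is conservatively treated as stale, and the origin server is reduced to a single function that answers 304 (Not Modified) when the tag still matches.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh(cached_headers, now):
    """True if the cached copy may be served without asking the origin."""
    if "no-cache" in cached_headers.get("Pragma", ""):
        return False
    expires = cached_headers.get("Expires")
    if expires is None:
        return False  # conservative assumption for this sketch
    return now < parsedate_to_datetime(expires)


def revalidation_headers(cached_headers):
    """Extra request headers asking 'has my copy changed?'."""
    etag = cached_headers.get("ETag")
    return {"If-None-Match": etag} if etag else {}


def origin_response(request_headers, current_etag, body):
    """Hypothetical origin server: 304 and no body if the tag still matches."""
    if request_headers.get("If-None-Match") == current_etag:
        return 304, b""
    return 200, body


cached = {
    "Expires": "Wed, 26 Mar 2003 02:22:52 GMT",
    "ETag": '"8802-2c72-3ab178fc"',
}
now = datetime(2003, 3, 27, tzinfo=timezone.utc)
if not is_fresh(cached, now):
    status, body = origin_response(
        revalidation_headers(cached), '"8802-2c72-3ab178fc"', b"<html>...</html>"
    )
    print(status)
```

When the answer is 304 the cache keeps serving its stored copy, so only headers cross the network rather than the whole object.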
You can discover lots more about HTTP at:
http://www.w3.org/pub/WWW/Protocols/Specs.html
The tutorial for this lecture is
Tutorial #7.
Copyright © 2003 by
Philip Scott,
La Trobe University.