
Lecture 7: Internet Applications #3.3: HTTP/1.1

A nice short lecture this time -- we tie up a few last topics in HTTP.

HTTP/1.0 Performance Issues

HTTP/1.0 has been criticised for poor performance and lack of scalability. There are several aspects to this:

- HTTP/1.0 opens a new TCP connection for every single transaction. For example, if a Web page contains 10 <img ...> images, HTTP/1.0 must open a total of 11 TCP connections -- one for the original page, and 10 for the images. Problems which arise from this include:
  - Each connection incurs a noticeable overhead in initial connection establishment due to round-trip delays.
  - TCP initiates connections using the so-called "slow start" algorithm. This is necessary for proper operation, but is very inefficient for short transfers -- TCP typically takes 10 to 20KB of transferred data to get "up to full speed".
  Both of these can cause HTTP/1.0 browsers to seem really slow.
- TCP is required to maintain "state" information about closed connections for 240 seconds, to ensure that stray packets from old connections won't be interpreted as valid data by a later connection. When a server is handling a large number of short-lived connections, this state can tie up a large amount of memory, which is very inefficient.
- HTTP/1.0 has very limited support for caching.
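
To get a feel for why slow start hurts short transfers, here is a rough back-of-the-envelope sketch in Python. It assumes an idealised model that is not taken from the lecture: the sending window starts at one 1460-byte segment and doubles every round trip, there is no packet loss, and each new connection costs one extra round trip for the TCP handshake. The function names and constants are illustrative only.

```python
def round_trips_for_transfer(size_bytes, segment_bytes=1460, initial_window=1):
    """Count round trips needed to send size_bytes under idealised slow start.

    Assumption: the window starts at initial_window segments and doubles
    every round trip, with no loss and no receiver-window limit.
    """
    window = initial_window
    sent = 0
    trips = 0
    while sent < size_bytes:
        sent += window * segment_bytes
        window *= 2
        trips += 1
    return trips


# Assumption: one extra round trip per connection for the TCP handshake.
HANDSHAKE_TRIPS = 1


def total_trips_http10(object_sizes):
    """HTTP/1.0 style: every object pays the handshake and restarts slow start."""
    return sum(HANDSHAKE_TRIPS + round_trips_for_transfer(s) for s in object_sizes)
```

Under these assumptions a 15000-byte object needs 4 round trips of data, and a page made of 11 small 1000-byte objects costs 22 round trips in total when each object gets its own connection.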

Because of these issues, HTTP/1.0 is gradually being replaced by HTTP/1.1 (RFC 2616).

HTTP/1.1 Basics

HTTP/1.1 (RFC 2616) is now in widespread use. It extends the older protocol in a number of areas, notably persistent connections/pipelining and support for caching.

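To implement persistent connections, HTTP/1.1 uses a request (and also response) header called "Connection:". The two values of interest are "close", which means that the sender will close the TCP connection once the current transaction is complete, and "keep-alive", which asks for the connection to be held open. In HTTP/1.1 connections are persistent by default, so "keep-alive" matters mainly when talking to older HTTP/1.0 software; the TCP connection is held until either side sends a "Connection: close" header, indicating that it wishes to terminate.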
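
The decision described above can be sketched in a few lines of Python. This is a minimal illustration, not a full implementation: the function name wants_persistent is hypothetical, and the fallback rule (HTTP/1.1 persistent unless told otherwise, HTTP/1.0 not) reflects the default behaviour in RFC 2616.

```python
def wants_persistent(http_version, headers):
    """Decide whether the sender wants the TCP connection kept open."""
    # Header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}
    # The Connection header may carry several comma-separated tokens.
    tokens = [t.strip().lower() for t in normalised.get("connection", "").split(",")]
    if "close" in tokens:
        return False
    if "keep-alive" in tokens:
        return True
    # No explicit token: HTTP/1.1 defaults to persistent, HTTP/1.0 does not.
    return http_version == "HTTP/1.1"
```

For example, wants_persistent("HTTP/1.1", {"Connection": "close"}) is False, while an HTTP/1.1 message with no Connection header at all is treated as persistent.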

The browser can utilise a persistent connection by sending multiple requests over the connection without stopping and waiting for each of them to be satisfied before sending the next -- the responses are "in the pipeline". Similarly, the server can send its responses one after another, in the same order as the requests. This is possible because the end of each message can be unambiguously identified, for example using the "Content-length:" header. The big wins here are that there's no delay opening multiple TCP connections, and the slow-start algorithm has time to get up to full speed.
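
To see how "Content-length:" lets a client pull pipelined responses apart, here is a small Python sketch. It works on a byte string rather than a real socket, and it assumes every response carries a Content-Length header (chunked transfer encoding and other framing methods are deliberately ignored). The function name and the demo data are made up for illustration.

```python
def split_pipelined_responses(stream):
    """Split back-to-back HTTP responses into (status_line, body) pairs.

    Assumption: every response has a Content-Length header.
    """
    responses = []
    pos = 0
    while pos < len(stream):
        # The header block ends at the first blank line.
        header_end = stream.index(b"\r\n\r\n", pos)
        lines = stream[pos:header_end].decode("iso-8859-1").split("\r\n")
        status_line = lines[0]
        length = 0
        for line in lines[1:]:
            name, _, value = line.partition(":")
            if name.strip().lower() == "content-length":
                length = int(value.strip())
        # The body is exactly `length` bytes; the next response starts after it.
        body_start = header_end + 4
        responses.append((status_line, stream[body_start:body_start + length]))
        pos = body_start + length
    return responses


# Two responses as they might arrive, one after another, on one connection.
demo = (
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
    b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nabc"
)
```

Calling split_pipelined_responses(demo) yields two pairs, with bodies b"hello" and b"abc", matched to the requests purely by their order.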

Support for Caching in HTTP/1.1

The World Wide Web has been spectacularly successful -- so successful that a huge proportion of Internet traffic is HTTP, ie Web pages and related objects (eg, images). Caching is a technique whereby copies of popular objects are kept in strategic locations, and supplied in lieu of the originals, saving huge amounts of traffic on the "backbone networks".

The Conditional-GET operation seen earlier allows support for caching at the browser level -- that is, the browser can keep a local copy of an object and check if it's up to date before displaying it.

HTTP/1.1 adds support for proxy server caches. A proxy server is an HTTP server which fetches Web objects (pages, images, etc) on behalf of its clients. When a request is sent to a proxy, the requested object is specified as a full URL, so the first line of a GET request now looks like:

GET http://www.bendigo.latrobe.edu.au/index.html HTTP/1.1
....other request headers...
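
As a rough illustration, the following Python sketch builds the start of a GET request in both forms: the full-URL form shown above for a proxy, and the path-only form (with the host carried in a "Host:" header) normally sent directly to the origin server. The function name build_get_request is hypothetical and only the minimum headers are produced.

```python
from urllib.parse import urlsplit


def build_get_request(url, via_proxy):
    """Return a minimal HTTP/1.1 GET request for url as a string."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # A proxy needs the full URL to know which server to contact;
    # an origin server only needs the path, plus the Host header.
    target = url if via_proxy else path
    return (
        f"GET {target} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n"
        "\r\n"
    )
```

With via_proxy=True the request line for the URL above is exactly the one shown; with via_proxy=False it becomes "GET /index.html HTTP/1.1".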

[Figure: HTTP proxy server system diagram]

Implementing Caching

It's obvious that the proxy server can check its local cache to see if a requested object has recently been fetched. It is slightly more subtle to discover if the cached copy is still the same as the object on the server. HTTP/1.1 relies on the following response headers to ensure that caching works correctly:

- Expires: Wed, 26 Mar 2003 02:22:52 GMT: An object can be marked as having a limited lifetime, and once the specified date/time has passed it must be re-fetched from the originating server.
- Pragma: no-cache: An object can be flagged as "un-cachable". This header and Expires were both already present in HTTP/1.0.
- Etag: "8802-2c72-3ab178fc": This is the "Entity Tag", and is used to discover, with somewhat greater certainty, if the object (or entity) in the local cache is exactly the same object (ie, isn't different in any way) as the object stored on the remote server. The client can use an If-None-Match: "8802-2c72-3ab178fc" header with a GET request to specify the version of the object which it already has. This is a significant improvement over the HTTP/1.0 "Conditional-GET". Note that HTTP/1.1 has a large number of other operations which can be used with Entity Tags.
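
The two checks described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: headers are held in plain dictionaries, a missing Expires header is conservatively treated as "not fresh", and If-None-Match is assumed to carry a single entity tag. The function names are hypothetical; status 304 (Not Modified) tells the client that its cached copy is still good.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh(cached_headers, now):
    """Cache side: may the stored copy be served without asking the server?"""
    if "no-cache" in cached_headers.get("Pragma", ""):
        return False
    expires = cached_headers.get("Expires")
    if expires is None:
        return False  # assumption: no Expires header means "revalidate"
    return now < parsedate_to_datetime(expires)


def answer_conditional_get(current_etag, request_headers):
    """Server side: status code for a GET that may carry If-None-Match."""
    if request_headers.get("If-None-Match") == current_etag:
        return 304  # Not Modified: the client's copy is identical
    return 200  # send the full object


cached = {"Expires": "Wed, 26 Mar 2003 02:22:52 GMT",
          "Etag": '"8802-2c72-3ab178fc"'}
```

A cache would call is_fresh first, and only when the copy has expired send a GET with If-None-Match set to the stored Etag value; a 304 reply means the object need not be transferred again.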

You can discover lots more about HTTP at: http://www.w3.org/pub/WWW/Protocols/Specs.html

The tutorial for this lecture is Tutorial #7.
