The World Wide Web
Of all the Big Ideas in computer networking, the invention of the
World Wide Web (also called the WWW, or just the Web) would have to
be the biggest.
History:
- 1989
- Original proposal from Tim Berners-Lee at CERN for a "Web" of
linked documents. Prototype followed soon after.
- December 1991
- First public demonstration.
- February 1993
- Mosaic
released by NCSA.
- February 1994
- We (Division of IT) start running a Web server on machine ironbark
at Bendigo (the first regional institute in Australia to do so, and
among the first 10 nationally!) Rah, Rah!
WWW Architecture
Four key components:
- Browser software (e.g. Netscape, IE, Mosaic, Opera, iCab, lynx,
Amaya, or even Emacs/W3).
- Web server software. The most popular server program is Apache;
this is what we run on ironbark.
- A collection of "hyperlinked" documents (or pages)
written in HTML
(the HyperText Markup Language). We examine some
aspects of HTML in today's lecture.
- The HyperText Transfer Protocol, HTTP. The browser
uses HTTP to obtain HTML documents, specified using a URL,
from a server.
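To make the browser/server exchange a little more concrete, here is a minimal sketch in Python (not part of the original lecture) of the request text a browser might send for a given URL. The function name build_get_request is hypothetical, and the bare HTTP/1.0 request line plus a Host header is a simplifying assumption; real browsers send many more headers.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> str:
    """Return the text of a minimal HTTP/1.0 GET request for `url`.

    Illustrative only: a real browser adds further headers.
    """
    parts = urlsplit(url)
    # An empty path means "the top-level page" and is requested as "/".
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {parts.netloc}\r\n"
        "\r\n"  # a blank line ends the request headers
    )


print(build_get_request("http://www.latrobe.edu.au/index.html"))
```

The browser would send this text over a TCP connection to the server named in the URL (port 80 unless the URL says otherwise), and the server would reply with a status line, some headers, and the HTML document itself.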
HTML
Although it is not "core" knowledge in this unit, we really need to mention
HTML.
HTML is a markup language.
- The structure of a HTML document (or Web page) is
described using embedded formatting codes (or tags)
intermingled with the information in the document.
- In HTML, the markup tags are delimited by the special characters
"<" and ">". If either of these characters must appear as part
of the actual data, it is written as &lt; or &gt; respectively.
- Philosophical note: it is (usually) regarded as a mistake to use HTML
to define the format (or appearance) of a document. One
of the fundamental ideas of the Web is that the document creator cannot
control how the browser will display it. Furthermore, defining only the
document's structure allows us to extract information in a way which
would otherwise be quite difficult.
- A HTML document is (despite sometimes looking rather formidable) plain
ASCII text.
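The rule about writing "<" and ">" as &lt; and &gt; is easy to apply mechanically. As a small sketch (in Python, which the lecture itself does not use), the standard library's html.escape and html.unescape do the conversion in each direction. The sample string is made up for illustration.

```python
from html import escape, unescape

# Text that contains the two characters HTML reserves for tags.
raw = "if a < b and b > c"

# quote=False leaves quotation marks alone; "<", ">" and "&" are replaced.
encoded = escape(raw, quote=False)
print(encoded)            # if a &lt; b and b &gt; c

# Decoding the entities gives back the original text.
print(unescape(encoded))  # if a < b and b > c
```

Note that "&" needs the same treatment (it is written as &amp;), since it is the character that introduces these codes.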
The HTML Document
The originators of the Web never envisaged that humans would write HTML
"by hand". However, it's so easy that using the various "Web Authoring"
packages seems tedious for many of us!
A HTML document (or page) consists of two mandatory parts, a HEAD and a BODY.
The HEAD section must (at least) contain a document TITLE. The document
structure looks like:
<HTML>
<HEAD>
<TITLE>Title of Document</TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF">
HTML document (or page) content goes in here.
</BODY>
</HTML>
Note the "start" and "end" tags for the various sections of the document.
Note also that the author can use any convenient layout (line breaks,
indentation and extra spaces) for the HTML document: it's ignored by
the browser!
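One way to convince yourself that layout does not matter is to parse two differently laid out copies of the same page and compare what the parser sees. The sketch below uses Python's standard html.parser module. The class name EventCollector and the helper events are hypothetical, and treating every run of white space as a single space is only a rough approximation of what a browser does.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Record tags and text, treating runs of white space as one space."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        text = " ".join(data.split())
        if text:  # skip text that is nothing but layout
            self.events.append(("text", text))


def events(html_text):
    collector = EventCollector()
    collector.feed(html_text)
    collector.close()
    return collector.events


compact = ("<HTML><HEAD><TITLE>Title of Document</TITLE></HEAD>"
           "<BODY>Some content.</BODY></HTML>")
spread_out = """
<HTML>
  <HEAD>
    <TITLE>Title of
           Document</TITLE>
  </HEAD>
  <BODY>
    Some
    content.
  </BODY>
</HTML>
"""

print(events(compact) == events(spread_out))
```

Both versions produce the same sequence of tags and text, which is why authors are free to indent HTML however they find readable.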
HTML Markups
It is possible to create perfectly useful Web documents using only a small
fraction of the available HTML codes. Some of the more useful basic tags
include:
- Headings
- HTML defines 6 levels of heading:
<H1> through to <H6>. <H1> is the largest, and <H6>
is the smallest. Headings should be used to break your document into
sections and subsections. Example:
<H1>Introduction</H1>
- Text Spacing
- Use <P> to force a new paragraph, and <BR>
to force a line break.
- Emphasis
- Severely overused! HTML allows both "logical" emphasis:
<EM>text</EM> and <STRONG>text</STRONG>, as well as
"physical"
emphasis: <I>text</I> and <B>text</B>. Philosophical
note: Use of the logical emphasis tags is usually recommended...
- Preformatted text, etc
- Use <KBD>text</KBD> to indicate "typewritten" text (strictly, text
typed in by the user). The <CODE>text</CODE> tag, intended for program
code, is usually displayed the same way. Use <PRE>lines of
text</PRE> for multi-line preformatted text.
- Misc
- Use <HR> to insert a horizontal line. Use
<BLOCKQUOTE>text</BLOCKQUOTE> for a quoted passage, and
<CITE>text</CITE> to name the source of a quotation.
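Because headings describe structure rather than appearance, a program can pull a table of contents out of a page without understanding anything else in it. The following is a minimal sketch in Python using the standard html.parser module. The names OutlineParser and outline, and the sample page, are invented for illustration.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class OutlineParser(HTMLParser):
    """Collect (level, text) pairs for every heading in a document."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None   # level of the heading being read, if any
        self._chunks = []    # pieces of text seen inside that heading

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._chunks = []

    def handle_data(self, data):
        if self._level is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            text = " ".join("".join(self._chunks).split())
            self.outline.append((self._level, text))
            self._level = None


def outline(html_text):
    parser = OutlineParser()
    parser.feed(html_text)
    parser.close()
    return parser.outline


page = """
<H1>Introduction</H1>
<P>Some text with <EM>emphasis</EM>.
<H2>Background</H2>
<P>More text.
<H1>Conclusion</H1>
"""

for level, text in outline(page):
    print("  " * (level - 1) + text)
```

Had the author faked the headings with large bold text instead, there would be nothing reliable for a program like this to look for.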
HTML List Tags
There are three commonly used list formats in HTML:
- Unnumbered (or bulleted) lists use the <UL> tag:
-
<UL>
<LI>List item
<LI>Another list item
</UL>
- Numeric (or numbered) lists use the <OL> (ordered list) tag:
-
<OL>
<LI>First list item
<LI>Second list item
</OL>
- Description Lists use the <DL> tag:
-
<DL>
<DT>Title of first item
<DD>Data for first item
<DT>Title of second item
<DD>Data for second item
</DL>
Images and Hyperlinks
To include a graphic image in a Web document, the <IMG> markup is
used. It can take several optional parameters, and can refer to a
variety of image formats. Basic usage is:
<IMG SRC="photo.gif">
It could also look something like:
<IMG SRC="photo.gif" align=right width=200 height=100 alt="Photo">
The hyperlink is the basis of the entire "linked documents"
idea of the Web. The HTML looks like:
<A HREF="http://www.latrobe.edu.au/index.html">La Trobe University</A>
In a browser, this would normally be displayed something like:
La Trobe University
When the user moves the cursor (mouse pointer) over this text, and
"clicks", the browser fetches the URL named in the HREF parameter and
displays it instead of the current page.
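Before it can fetch anything, the browser has to find the HREF value and, if it is a relative URL, work out the full address from the address of the current page. Here is a rough sketch of that step in Python, using the standard html.parser and urllib.parse modules. The names LinkParser and find_links are hypothetical, and the relative link courses/list.html is made up purely for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect the absolute URL of every <A HREF="..."> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive in lower case.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Relative URLs are resolved against the current page.
                    self.links.append(urljoin(self.base_url, value))


def find_links(html_text, base_url):
    parser = LinkParser(base_url)
    parser.feed(html_text)
    parser.close()
    return parser.links


base = "http://www.latrobe.edu.au/index.html"
page = """
<A HREF="http://www.latrobe.edu.au/index.html">La Trobe University</A>
<A HREF="courses/list.html">A made-up relative link</A>
"""

for url in find_links(page, base):
    print(url)
```

The second link comes out as http://www.latrobe.edu.au/courses/list.html, which is the address the browser would actually request if the user clicked on it.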
New Developments in HTML
- The current "official" version of HTML is 4.0
(see http://www.w3.org/Markup/
for details). It's now a very large and complicated language, and it's
clear that ordinary punters will have difficulty in using all of it!
- HTML 4.0 introduces the idea of Cascading Style Sheets
which specify in detail how the author would prefer the markup tags to
be interpreted. In general, these are a good thing...
- The eXtensible Markup Language (XML) provides a new
basis for developing new markup languages in the Web (of which HTML is
just one), and should allow the Web to evolve into a much richer
environment...
Other Resources
Here's a great 10-minute introduction to HTML. It's very similar to
the content of today's lecture. For an excellent HTML tutorial, have
a look at Mal Sutherland's HTML notes. The NCSA people also have their
NCSA (at UIUC) Beginner's Guide to HTML. Another excellent resource is
the Web Design Group.
This lecture is also available in PostScript
format.
The tutorial for this lecture is Tutorial #05.
Phil Scott