Lecture 6: Applications #3.1: HTML and HTTP Basics
The World Wide Web
Of all the Big Ideas in computer networking, the invention of the World Wide Web (also
called the WWW, or just the Web)
would have to be the biggest.
History:
- 1989
- original proposal from Tim Berners-Lee at CERN for a "Web" of linked documents.
Prototype followed soon after.
- December 1991
- First public demonstration.
- February 1993
- Mosaic
(first alpha version) released by NCSA. First fully
operational, multiplatform version released in September.
Awareness of WWW project growing.
- February 1994
- We (Division of IT) start running a Web server on the machine ironbark at
Bendigo (the first regional institute in Australia to do
so, and among the first 10 nationally!) Rah, Rah!
- Early 1995
- Netscape Communications
releases Netscape
Navigator 1.1. The rest is, as they say, history.
WWW Architecture
Four key components:
- Browser software (eg Netscape, IE, Mosaic, Opera, iCab, lynx, Amaya,
or even (for the truly desperate) Emacs/W3).
- Web server software. The most popular server program is Apache;
this is what we run on ironbark.
- A collection of "hyperlinked" documents (or pages)
written in HTML (the HyperText Markup
Language). We examine some aspects of HTML in today's
lecture.
- The HyperText Transfer Protocol, HTTP. The browser
uses HTTP to obtain HTML documents, specified using a
URL, from a server.
Digression: HTML
Although it is not "core" knowledge in this unit, we really need to mention
HTML. The following is a quick "mini-tutorial" on HTML for the small
proportion of students who have no experience in it...
HTML is a markup language.
- The structure of a HTML document (or Web page) is
described using embedded formatting codes (or
tags) intermingled with the information in the
document.
- In HTML, the markup tags are delimited by the special characters
"<" and ">". If either of these characters
must appear as part of the actual data, they are written as
&lt; and &gt; respectively.
- Philosophical note: it has been regarded as a mistake to use HTML
to strongly enforce a particular layout (or
appearance) of a document. One of the fundamental ideas of the Web
is that the document creator cannot control how the browser will
display it. Furthermore, defining only the document's structure
allows us to extract information in a way which would otherwise be
quite difficult. See also CSS, later.
- A HTML document is (despite sometimes looking rather formidable)
lines of plain ASCII text.
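The escaping rule for "<" and ">" can be tried out with a few lines of Python. This is only a small sketch using the standard library's html module; the helper name escape_text and the sample string are made up for illustration. Note that "&" itself also has to be escaped (as &amp;amp;), since it introduces the escape sequences.

```python
import html


def escape_text(data: str) -> str:
    """Rewrite the characters that would otherwise be read as markup."""
    # quote=False leaves quotation marks alone; only &, < and > are rewritten.
    return html.escape(data, quote=False)


sample = "if a < b and b > c"          # hypothetical page content
escaped = escape_text(sample)
print(escaped)                         # if a &lt; b and b &gt; c
print(html.unescape(escaped))          # if a < b and b > c
```

The second print shows the round trip: a browser displaying the escaped text shows the reader the original characters.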
The HTML Document
The originators of the Web (reportedly) never envisaged that humans
would write HTML "by hand". However, it's so easy that using the
various "Web Authoring" packages seems tedious for many of us!
A HTML document (or page) consists of two mandatory parts, a HEAD and a
BODY. The HEAD section must (at least) contain a document TITLE, and
should normally be preceded by a "document type" specifier. The overall
document structure looks like:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<TITLE>Title of Document</TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF">
HTML document (or page) content goes in here.
</BODY>
</HTML>
Note the "start" and "end" tags for the various sections of the
document. Note also that the author can lay out the HTML source (line
breaks and indentation) in any convenient way: the browser ignores it!
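To see that source layout really does not matter, here is a rough sketch using Python's standard html.parser module. It reads two differently laid out copies of the same document and compares the tags and text it finds. The class name StructureParser and the two sample strings are invented for this example, and the sketch collapses runs of whitespace itself, which is roughly what a browser does outside preformatted text.

```python
from html.parser import HTMLParser


class StructureParser(HTMLParser):
    """Record start tags, end tags and text, ignoring source layout."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))   # tag names arrive in lower case

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        text = " ".join(data.split())        # collapse runs of whitespace
        if text:                             # skip whitespace-only chunks
            self.events.append(("text", text))


def structure(source: str):
    """Return the list of (kind, value) events found in the HTML source."""
    parser = StructureParser()
    parser.feed(source)
    parser.close()
    return parser.events


spread_out = """
<HTML>
  <HEAD>
    <TITLE>Title of Document</TITLE>
  </HEAD>
  <BODY>
    Content goes
    in here.
  </BODY>
</HTML>
"""
one_line = ("<HTML><HEAD><TITLE>Title of Document</TITLE></HEAD>"
            "<BODY>Content goes in here.</BODY></HTML>")

print(structure(spread_out) == structure(one_line))   # True
```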
HTML Markups
It is possible to create perfectly useful Web documents using only a
small fraction of the available HTML codes. Some of the more useful
basic tags include:
- Headings
- HTML defines 6 levels of heading: <H1> through to
<H6>. <H1> is the largest, and <H6> is the
smallest.
- Text Spacing
- Use <P> to force a new paragraph, and <BR> to force
a line break.
- Emphasis
- Severely overused! HTML allows both "logical" emphasis:
<EM>text</EM> and <STRONG>text</STRONG>, as
well as "physical" emphasis: <I>text</I> and
<B>text</B>. Philosophical note: Use
of the logical emphasis tags is usually recommended...
- Preformatted text, etc
- Use <KBD>text</KBD> to indicate text typed by the user, and
<CODE>text</CODE> for fragments of program code; browsers normally
show both in a "typewriter" (fixed-width) font. Use
<PRE>lines of text</PRE> for multi-line preformatted
text.
- Misc
- Use <HR> to insert a horizontal line. Use
<BLOCKQUOTE>text</BLOCKQUOTE> for quoted passages and
<CITE>text</CITE> for the title of a work being cited.
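Because tags such as the headings describe structure rather than appearance, a program can pull that structure back out of a page. The following is a minimal sketch, again using Python's html.parser, which builds an outline from the <H1> to <H6> headings. The class name OutlineParser and the sample page are assumptions made up for the example.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect (level, text) pairs for the headings H1 to H6."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None    # level of the heading currently open, if any
        self._parts = []      # pieces of text seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._level is not None:
            text = " ".join("".join(self._parts).split())
            self.outline.append((self._level, text))
            self._level = None


def outline(source: str):
    """Return the headings of an HTML document as (level, text) pairs."""
    parser = OutlineParser()
    parser.feed(source)
    parser.close()
    return parser.outline


# Hypothetical page, for illustration only.
page = """<H1>Computer Networks</H1>
<P>Some text.
<H2>The <EM>World Wide</EM> Web</H2>
<H2>HTTP</H2>"""

for level, text in outline(page):
    print("  " * (level - 1) + text)   # indent by heading level
```

Had the page used font sizes and bold text instead of heading tags, there would be nothing reliable for such a program to look for.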
HTML List Tags
There are three commonly used list formats in HTML:
- Unnumbered (or bulleted) lists use the <UL> tag:
<UL>
<LI>List item
<LI>Another list item
</UL>
- Numeric (or numbered) lists use the <OL> (ordered list) tag:
<OL>
<LI>First list item
<LI>Second list item
</OL>
- Description Lists use the <DL> tag:
<DL>
<DT>Title of first item
<DD>Data for first item
<DT>Title of second item
<DD>Data for second item
</DL>
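Since list markup is so regular, pages are often generated by a program rather than typed. As a small sketch (the function name render_list is made up here), the following Python produces the <UL> and <OL> forms shown above, leaving out the optional </LI> end tags just as those examples do, and escaping any special characters in the items.

```python
import html


def render_list(items, ordered=False):
    """Return HTML for a bulleted (UL) or numbered (OL) list of strings."""
    tag = "OL" if ordered else "UL"
    lines = [f"<{tag}>"]
    for item in items:
        # Escape &, < and > so item text is never mistaken for markup.
        lines.append("<LI>" + html.escape(item, quote=False))
    lines.append(f"</{tag}>")
    return "\n".join(lines)


print(render_list(["List item", "Another list item"]))
print(render_list(["First list item", "Second list item"], ordered=True))
```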
Images and Hyperlinks
To include a graphic image in a Web document, the <IMG> markup
is used. It can take several optional parameters, and can refer to a
variety of image formats. Basic usage is:
<IMG SRC="photo.gif">
It could also look something like:
<IMG SRC="photo.gif" align=right width=200 height=100 alt="Photo">
The hyperlink is the basis of the
entire "linked documents" idea of the Web. The HTML looks like:
<A HREF="http://www.latrobe.edu.au/index.html">La Trobe University</A>
In a browser, this would normally be displayed something like:
La Trobe University
When the user moves the cursor (mouse pointer) over this text, and
"clicks", the browser fetches the URL named in the HREF parameter and
displays it instead of the current page.
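Before it can react to a click, the browser has to find each hyperlink and remember which URL goes with which piece of text. A rough sketch of that bookkeeping, using Python's html.parser, is shown here. The class name LinkParser and the sample fragment are assumptions for illustration, and a real browser does a great deal more.

```python
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Collect (URL, link text) pairs from the <A HREF="..."> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None     # HREF of the link currently open, if any
        self._text = []       # pieces of text seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")   # attribute names are lower-cased
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def find_links(source: str):
    """Return every hyperlink in the HTML source as a (URL, text) pair."""
    parser = LinkParser()
    parser.feed(source)
    parser.close()
    return parser.links


# Hypothetical fragment built around the link shown in the text.
fragment = ('<P>Visit <A HREF="http://www.latrobe.edu.au/index.html">La\n'
            'Trobe University</A> today.')

print(find_links(fragment))
```

Each pair gives the URL the browser would fetch and the text the user would click on.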
New Developments and Directions For HTML
- The current "official" version of HTML is 4.01, now re-specified
using XML as XHTML 1.0 (XML is covered later; see
http://www.w3.org/Markup/ for details). It's now a very large and
complicated language, and it's clear that ordinary punters will have
difficulty in using all of it!
- HTML 4.0 added support for Cascading Style
Sheets (CSS), which specify in detail how the author would prefer
the markup tags to be displayed. These are a good thing, but not
(yet) widely used.
- The eXtensible Markup Language (XML) provides a
basis for developing new markup languages for the Web (of which
XHTML is just one), and should allow the Web to evolve into a much
richer environment. We examine XML later.
Other HTML Resources
The WWW Consortium has a great 10 minute introduction to
HTML. It's very similar to the content of today's
lecture. For an excellent HTML tutorial, have a look at the first few
lectures for the La Trobe University, Bendigo subject Web
Development. The NCSA people (at UIUC) also have their Beginner's
Guide to HTML. Another excellent resource is
the Web Design Group.
Hypertext Transfer Protocol (HTTP)
In Lecture #2, the World Wide Web
was used to illustrate the idea of a layered communications
architecture. In that lecture, the basic ideas of the original
version (0.9, circa 1992) of HTTP were introduced. To revise, in
HTTP 0.9 the GET operation was used to
obtain HTML "pages" from a server, eg:
GET /Index.html
<HTML>
<HEAD>
<TITLE>La Trobe University</TITLE>
</HEAD>
<BODY>
<h2>La Trobe University</H2>
..........etc
HTTP 0.9 actually defined a few other operations. However, since HTTP
1.0 (RFC 1945) and HTTP
1.1 are now commonly used, we shall defer discussion of them.
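The HTTP 0.9 exchange shown above is simple enough to act out in a few lines of Python. This is a toy sketch only: the "server" is a thread in the same program, the two ends are joined by a local socket pair rather than a real network connection, and the names DOCUMENTS, serve_one and fetch, along with the wording of the error pages, are made up for the example. It does show the essentials: the client sends a single GET line, and the server replies with the bare document (no status line, no headers) and then closes the connection.

```python
import socket
import threading

# Hypothetical document store standing in for the server's files.
DOCUMENTS = {
    "/Index.html": (b"<HTML>\n<HEAD>\n<TITLE>La Trobe University</TITLE>\n"
                    b"</HEAD>\n<BODY>\n<H2>La Trobe University</H2>\n"
                    b"</BODY>\n</HTML>\n"),
}


def serve_one(conn):
    """Toy HTTP 0.9-style server: read one request line, reply, close."""
    with conn:
        request = b""
        while not request.endswith(b"\n"):
            chunk = conn.recv(1024)
            if not chunk:
                break
            request += chunk
        parts = request.decode("ascii", "replace").split()
        if len(parts) == 2 and parts[0] == "GET":
            # Assumed behaviour: unknown pages get a small HTML error page.
            body = DOCUMENTS.get(parts[1],
                                 b"<HTML><BODY>Not found</BODY></HTML>\n")
        else:
            body = b"<HTML><BODY>Bad request</BODY></HTML>\n"
        conn.sendall(body)    # just the document: no status line or headers


def fetch(path):
    """Send 'GET path' and return everything the server sends back."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve_one, args=(server,))
    worker.start()
    with client:
        client.sendall(f"GET {path}\r\n".encode("ascii"))
        chunks = []
        while True:
            chunk = client.recv(4096)
            if not chunk:     # server closed the connection: end of document
                break
            chunks.append(chunk)
    worker.join()
    return b"".join(chunks).decode("ascii")


print(fetch("/Index.html"))
```

Notice that the only way the client knows the document is complete is that the server closes the connection.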
The tutorial for this lecture is
Tutorial #06.
Copyright © 2001 by
Philip Scott,
La Trobe University.