An initial HTTP attempt to access a "password protected" Web page of this type (without providing suitable "authentication" information) will generate an HTTP error message together with a Web page which explains the nature of the error. Typically the response headers will contain:

    HTTP/1.1 401 Authorization Required
    Date: Wed, 17 Mar 2004 01:17:56 GMT
    Server: Apache/1.2.6
    WWW-Authenticate: Basic realm="ByPassword"
    Last-Modified: Mon, 15 Mar 2004 00:43:51 GMT
    ....etc....

In HTTP/1.0, only the Basic authentication method
was available, as used in this example.
Upon receiving this error, the Web browser will normally pop up a
dialog box, collect a user-ID and password from
the user, and then retry the request with an additional
"Authorization: " request header containing this
information.
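As a rough sketch of the first half of that exchange, the following Python shows how a client might recognise the challenge in the response above. The function names are invented for illustration, and the headers are assumed to arrive as a plain dictionary keyed exactly as "WWW-Authenticate"; a real browser is considerably more forgiving.

```python
import re

def parse_challenge(header_value):
    """Split a WWW-Authenticate value into (scheme, realm).

    Hypothetical helper: only handles the simple form shown in the
    example response, e.g.  Basic realm="ByPassword"
    """
    scheme, _, params = header_value.partition(" ")
    match = re.search(r'realm="([^"]*)"', params)
    realm = match.group(1) if match else None
    return scheme, realm

def realm_to_prompt_for(status_code, headers):
    """Return the realm name if this response is a Basic challenge, else None."""
    if status_code != 401:
        return None
    scheme, realm = parse_challenge(headers.get("WWW-Authenticate", ""))
    return realm if scheme.lower() == "basic" else None

# The response headers from the example above.
example_headers = {"WWW-Authenticate": 'Basic realm="ByPassword"'}
print(realm_to_prompt_for(401, example_headers))  # ByPassword
```

The realm is the name a browser would typically show in its dialog box when asking for the user-ID and password.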
Authorization Request Header
Let's use, as an example, a page for which the username is
"student" and the password is also "student" --
pretty typical :-). The browser joins the two with a colon, so the
concatenation is thus "student:student". We can use the Unix
command-line base64 program mimencode to encode the data (it
encodes to "c3R1ZGVudDpzdHVkZW50"), so that the
request header will look something like:

    GET /subjects/int21cn/test/index.html HTTP/1.0
    Authorization: Basic c3R1ZGVudDpzdHVkZW50
    ....etc....

This, of course, raises the obvious question -- why on earth do they do this? The obvious answer is "for security reasons" -- to deter casual network snoopers who might be observing traffic, watching for passing user-IDs and passwords. We are left wondering...
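To see how little protection this gives, here is a minimal Python sketch of both directions, using the standard base64 module. The two function names are made up for this example. Note that decoding needs no key or secret of any kind: anyone who can see the header can recover the password.

```python
import base64

def basic_auth_value(username, password):
    """Build the value of an 'Authorization: ' header for Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")

def decode_basic_auth_value(header_value):
    """Recover (username, password) from a Basic authorization value."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic authorization value")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # Split on the first colon only: the password itself may contain colons.
    username, _, password = decoded.partition(":")
    return username, password

value = basic_auth_value("student", "student")
print(value)                           # Basic c3R1ZGVudDpzdHVkZW50
print(decode_basic_auth_value(value))  # ('student', 'student')
```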
A server can ask the browser to remember a name/value pair (a "cookie") by including a "Set-cookie: " header in its response:

    HTTP/1.0 200 OK
    Set-cookie: myname=myvalue
    ....etc...

A browser which is "cookie-enabled" will normally[1] store this name/value pair, and future requests to the same server will contain an additional request header, thus:

    GET /somefile.html HTTP/1.0
    Cookie: myname=myvalue
    ....etc...

Cookies are extensively used in Web session management, which is discussed later in the unit.
[1] In fact, cookie
operation is rather more complex than we discuss here -- for example,
the "Set-cookie: " header can take several
additional parameters (which affect how the cookie is interpreted),
and the behaviour of browsers with respect to cookies can be changed
by the end-user.
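The basic round trip can be sketched with Python's standard http.cookies module. This is an illustration only: the helper names are invented, the "Path=/" parameter is just one example of the additional parameters mentioned in the footnote, and a real browser also checks domains, paths and expiry times before sending a cookie back.

```python
from http.cookies import SimpleCookie

def cookies_from_set_cookie(set_cookie_value):
    """Parse the value of a 'Set-cookie: ' header into a {name: value} dict.

    Extra parameters such as Path are parsed but ignored in this sketch.
    """
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    return {name: morsel.value for name, morsel in jar.items()}

def cookie_request_header(cookies):
    """Build the 'Cookie: ' request header that returns the stored cookies."""
    return "Cookie: " + "; ".join(f"{name}={value}" for name, value in cookies.items())

stored = cookies_from_set_cookie("myname=myvalue; Path=/")
print(stored)                         # {'myname': 'myvalue'}
print(cookie_request_header(stored))  # Cookie: myname=myvalue
```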
A form in HTML is an area of a Web page which is used to gather input
from a human user. The information which is gathered can then be
returned to the page's owner using a SUBMIT
action.
The form is, as expected, delimited by a <FORM> and
</FORM> markup pair.
The <FORM> markup has two important attributes:
ACTION
    The URL to which the form data is submitted.
METHOD
    The way in which the ACTION URL is accessed. There are two methods,
    GET and POST.
For example:

    <FORM ACTION="http://ironbark.bendigo.latrobe.edu.au/cgi-bin/myprog" METHOD="GET">
Within a form, input fields are specified using INPUT
tags. Each INPUT tag has an associated TYPE
attribute. For example:

    <INPUT TYPE="TEXT">

This INPUT type can take several further
attributes, eg:

    <INPUT TYPE="TEXT" NAME="Name" MAXLENGTH="64" SIZE="20">

In a browser, this would be presented as a (scrollable) textbox, 20 characters wide (but able to accept 64 characters of input).
There are several other INPUT types:
TYPE="PASSWORD"
TYPE="CHECKBOX"
TYPE="RADIO"
TYPE="IMAGE"
TYPE="HIDDEN"
TYPE="SUBMIT"
TYPE="RESET"
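Two other markup tags can also be used to gather input within a form:
SELECT
    Presents a list of choices, each given by an OPTION markup tag, which can take a couple of
    extra attributes.
TEXTAREA
    Presents a multi-line text entry area. Its size is set with ROWS and COLS, and it
    can have a NAME attribute and an initial value.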
When a form is submitted, the browser encodes the form data in a format called:

    application/x-www-form-urlencoded

...or simply "URL-encoded". In this format:
Space characters are replaced by the "+" character. This is a hangover from an older
format and is normally, but not universally, used -- see next
point.
Special characters (reserved or non-printable) are replaced by %HH, where the H characters
are the two hexadecimal digits of the byte. Sometimes the space
character is also sent in this format, as
"%20", instead of as "+".
Each form field is sent as name=value,
with each name-value pair separated by the
"&" (ampersand) character.
There are two ways of submitting the form data, corresponding to METHOD=GET and METHOD=POST.
GET
    An HTTP GET request is issued to the
    ACTION URL specified in the
    <FORM> markup tag, with the
    urlencoded form information appended after a separating
    "?" character. This can generate
    very long URLs.
POST
    An HTTP POST
    transaction is performed. The "body" of the transaction
    contains the urlencoded form data, as a single long line of
    text. The POST transaction is directed at the URL specified in
    the ACTION attribute of the
    <FORM> tag.
In "real life", GET and POST
methods are used pretty much interchangeably, depending on the
programmer's or system designer's preference.
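The URL-encoding rules and the two submission methods can be sketched in Python with the standard urllib.parse module. The field values and the two build_... helper names are invented for illustration; the path /cgi-bin/myprog comes from the earlier FORM example. The Content-Type and Content-Length headers in the POST request are what browsers conventionally send, so treat them here as an assumption and not as something these notes have established.

```python
from urllib.parse import urlencode, quote, parse_qs

# Hypothetical form contents, chosen to exercise the encoding rules.
fields = {"info1": "hello world", "info2": "a&b=c"}

# urlencode() sends spaces as "+" and other special bytes as %HH.
encoded = urlencode(fields)
print(encoded)               # info1=hello+world&info2=a%26b%3Dc

# quote() shows the alternative form, where the space is sent as %20.
print(quote("hello world"))  # hello%20world

# Decoding reverses the process (each name maps to a list of values).
print(parse_qs(encoded))     # {'info1': ['hello world'], 'info2': ['a&b=c']}

def build_get_request(action_path, form_fields):
    """GET: the encoded data is appended to the URL after a '?'."""
    return f"GET {action_path}?{urlencode(form_fields)} HTTP/1.0\r\n\r\n"

def build_post_request(action_path, form_fields):
    """POST: the encoded data travels in the body, as one long line."""
    body = urlencode(form_fields)
    return (
        f"POST {action_path} HTTP/1.0\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    )

print(build_get_request("/cgi-bin/myprog", fields))
print(build_post_request("/cgi-bin/myprog", fields))
```

Notice that the encoded data is identical in both cases; only where it is carried in the request differs.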
GET
When you click the Submit button, you
should pay close attention to two things: the URL which is requested, and the form data appended to it after the
? character.
The HTML for our FORM looks like:

    <FORM action="/subjects/int21cn/cgi/L06CGIa.cgi" method="GET">
    info1: <INPUT type="text" name="info1" size="20"><br>
    info2: <INPUT type="text" name="info2" size="20"><br>
    <input type="submit" value="Submit">
    <input type="reset" value="Clear Form">
    </FORM>

This is rendered in your Web browser as two labelled text boxes, followed by "Submit" and "Clear Form" buttons.
Try it!
POST
In this case, we're going to try something different -- the CGI program which is the target of this Form is going to show us the actual HTTP request as it was received[2].
Again, try it.
[2] Actually, it's a "reconstructed" version of the HTTP request: not all request headers are necessarily shown. But it's close enough for our purposes!
When a user clicks the SUBMIT button on a form, the
HTTP server starts up the specified CGI program, and makes the form
data available to it.
From a programming perspective, the difference between
GET and POST is the way in
which a CGI program receives the
form data. If the method was GET, the information
is usually obtained by examining the contents of an environment
variable (usually called "QUERY_STRING")
containing the URL-encoded form data. Other environment variables
contain additional useful information.
If the method was POST, the CGI program usually
receives the form data on its standard input stream,
with any extra stuff obtained, as before, from environment
variables.
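A minimal Python sketch of that logic follows. QUERY_STRING is named above; REQUEST_METHOD and CONTENT_LENGTH are the other environment variables a CGI program conventionally consults, and are an assumption as far as these notes go. The function name read_form_data is invented, and the environment and input stream are passed in as parameters so the sketch can be run without a Web server; a real CGI program would pass os.environ and sys.stdin.

```python
import io
from urllib.parse import parse_qs

def read_form_data(environ, stdin):
    """Return the submitted form fields as a {name: [values]} dict.

    environ: a mapping of CGI environment variables (os.environ in a real CGI).
    stdin:   a text stream holding the request body (sys.stdin in a real CGI).
    """
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the URL-encoded data arrives on standard input.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET: the URL-encoded data is in an environment variable.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw, keep_blank_values=True)

# Simulated GET submission.
get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "info1=abc&info2=x+y"}
print(read_form_data(get_env, io.StringIO("")))

# Simulated POST submission carrying the same data in the body.
post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "19"}
print(read_form_data(post_env, io.StringIO("info1=abc&info2=x+y")))
```

Both calls yield the same fields, which is the point: once the data has been read, the rest of the program need not care which method was used.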
CGI programs can, as a rule, be written in any language (compiled or interpreted) supported on the system running the HTTP server.
On Unix servers, they are commonly written in Perl,
C or as Bourne shell (/bin/sh)
scripts.
A CGI program (almost) always generates (to standard output) a Web page which is returned to the browser, in addition to any other effect.
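As a sketch of the output side, the following Python builds the kind of text a CGI program might write to standard output: a "Content-Type: " header, a blank line, then the Web page itself. The function name and the page layout are invented for illustration, and the header-then-blank-line layout is the usual CGI convention, not something spelled out in these notes.

```python
import html

def render_response(fields):
    """Build a CGI response: a header, a blank line, then an HTML page.

    fields: a {name: value} dict of form data to echo back to the user.
    """
    rows = "".join(
        # Escape user-supplied text so it cannot inject markup into the page.
        f"<li>{html.escape(name)} = {html.escape(value)}</li>\n"
        for name, value in fields.items()
    )
    page = (
        "<html><body>\n"
        "<h1>Form data received</h1>\n"
        f"<ul>\n{rows}</ul>\n"
        "</body></html>\n"
    )
    return "Content-Type: text/html\r\n\r\n" + page

# A real CGI program would simply print this to standard output.
print(render_response({"info1": "abc", "info2": "<b>"}), end="")
```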