Introduction to HTML Last Update: 5 January 1998 |
Because most HTML is served from HTTP (HyperText Transfer Protocol) servers, this is the most common type of URL you are likely to see. Consider the following examples:
http://www.w3.org/hypertext/Addressing/
http://www.java.utoronto.ca:3232/home.html
http://www.utoronto.ca/ian/books/html4ed/outline.html
What do these strings mean? The first part, http:, means that the documents are served by an HTTP server. The double slash (//) means that the next part is the name of the server. This can have two parts: the Internet address of the server (essential) and the port number the server listens at (optional). In the first example, www.w3.org, the port number is not specified, so the browser assumes the default number for HTTP servers (port 80). In the second case, the URL tells the browser that the HTTP server is at port 3232. The port is specified after the server name, separated by a colon.
The final part specifies the file or resource being requested: this is separated from the address+port number pair by a slash (/). The resource is specified by a path relative to the root directory of the server. Thus, in the third example, the document outline.html at www.utoronto.ca is found in the subdirectory .../ian/books/html4ed/ with respect to the HTTP server root.
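The breakdown described above can be sketched in a few lines of Python using the standard library's urllib.parse module. This is only an illustration: the function name describe_http_url and the constant DEFAULT_HTTP_PORT are hypothetical names chosen for this example, not part of any server or browser.

```python
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80  # assumed when the URL names no port


def describe_http_url(url):
    """Split an http URL into scheme, server name, port, and resource path."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # Fall back to the default port when none follows the colon.
        "port": parts.port if parts.port is not None else DEFAULT_HTTP_PORT,
        # An empty path is treated here as a request for the server root.
        "path": parts.path or "/",
    }


for example in (
    "http://www.w3.org/hypertext/Addressing/",
    "http://www.java.utoronto.ca:3232/home.html",
    "http://www.utoronto.ca/ian/books/html4ed/outline.html",
):
    print(describe_http_url(example))
```

For the second example the sketch reports port 3232, while the other two fall back to port 80.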
A file or resource specification beginning with /cgi-bin/ is usually special: in the case of many servers, the cgi-bin string indicates a special reference to programs or scripts that can be executed by the server. This is discussed in more detail in section 8.1.1.
If the file name is left out, the server tries to send you a default directory file. Usually this is a file named "index.html", but this default name can be modified (or turned off) by the server configuration files. You should always include the trailing slash if you are referencing a directory, for example /directory/ as otherwise the server will think you are requesting a file named directory as opposed to information about the directory.
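As a rough sketch of the rule above, the following Python function shows how a hypothetical server might choose which file to send. The function name resolve_resource and the constant DEFAULT_INDEX are assumptions made for illustration; real servers read the default file name from their configuration and may handle a missing trailing slash differently.

```python
import posixpath

DEFAULT_INDEX = "index.html"  # assumed default; configurable on real servers


def resolve_resource(path, index_name=DEFAULT_INDEX):
    """Return the file a hypothetical server would send for a request path."""
    if path.endswith("/"):
        # The trailing slash marks a directory, so send its default file.
        return posixpath.join(path, index_name)
    # No trailing slash: the path is treated as naming a file.
    return path


print(resolve_resource("/directory/"))  # /directory/index.html
print(resolve_resource("/directory"))   # /directory
```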
The HTTP protocol supports the passing of arguments to the server. The general format is to append the arguments to the URL, separated from the URL by a question mark (?). The reason for this notation is simple: most requests of this type are requests to search a database, and the passed arguments are the search parameters.
The general form is as follows:
http://some.site.edu/cgi-bin/foo?arg1+arg2+arg3
What does this mean? There are two things to note:
First, the cgi-bin directory is a special location known to the server, containing executable programs or scripts. The reason is obvious: you have to pass arguments to something that can act on those arguments, implying a program or script. The cgi-bin directory contains programs/scripts that interface with the WWW: a URL can access and pass arguments to programs/scripts in this directory, and these programs/scripts can in turn act on the arguments and return information, documents, etc. to the browser.
Second, the program foo is sent three arguments: arg1, arg2 and arg3.
For more information see the W3C documentation on addressing.
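The argument-passing format described above can be sketched in Python as follows. This is a minimal illustration, not how any particular server parses requests: the function name split_arguments is hypothetical, and the sketch simply separates the part after the question mark at each plus sign.

```python
from urllib.parse import unquote, urlsplit


def split_arguments(url):
    """Return the program path and the list of '+'-separated arguments."""
    parts = urlsplit(url)
    if not parts.query:
        # No question mark (or nothing after it): no arguments were passed.
        return parts.path, []
    # Arguments follow the '?', separated from one another by '+'.
    args = [unquote(arg) for arg in parts.query.split("+")]
    return parts.path, args


program, args = split_arguments("http://some.site.edu/cgi-bin/foo?arg1+arg2+arg3")
print(program)  # /cgi-bin/foo
print(args)     # ['arg1', 'arg2', 'arg3']
```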
On many Web servers, users can have HTML documents in their own home directories, distinct from the special area reserved for administrative Web pages. The procedure for doing this depends to some degree on the server. In general, the user needs to create a special file, placed in their home directory, that specifies where their personal 'root' HTML directory is. You then access files in this personal 'root' area by using a special URL path of the form ~your_login_name/path/file, where the tilde (~) indicates that this is a 'personal' Web area. Again, this is a server-specific feature, and not all servers support it or have it turned on. Ask your server manager for details about your local implementation.
Introduction to HTML © 1994-1998 by Ian Graham Last Update: 5 January 1998 |