Introduction to HTML. Last Update: 5 January 1998
Doing searches using the ISINDEX element is not difficult. It can appear tricky at first, but becomes clearer once you remember two basic things:
This interface program between the database tools and the HTTP server is a script or program, usually placed in the server's "cgi-bin" directory. These scripts, known as gateway programs, are accessed via URLs such as http://www.foo.com/cgi-bin/foo, where foo is the name of the script or program, and /cgi-bin/ is a special path that references the directory containing the programs and scripts that can be executed by the server. The name does not need to be /cgi-bin/, and in fact many sites have several gateway program directories, each reflecting a particular project or task.
I have the file /u/www/Webdocs/Personnel on my HTTP server. I want to allow someone to search this file for names, using a WWW browser, and I want to do this using the ISINDEX element.
Sorry: An executable version of this test program is not available at this web site.
Step one is to create a script to interface the server and browser with the search program (here a program named grep). My script is srch-example, which is found in my server's cgi-bin directory.
This program can be accessed from this server via the URL http://www.java.utoronto.ca/cgi-bin/srch-example -- try it yourself, if you want.
The following is the content of srch-example:
    #!/bin/sh
    # Every response starts with the content-type header and a blank line.
    echo Content-type: text/html
    echo
    if [ $# = 0 ]
    then
      # No arguments: send the page that carries the ISINDEX search prompt.
      echo "<HEAD>"
      echo "<TITLE>UTIRC Phonebook Search</TITLE>"
      echo "<ISINDEX>"
      echo "</HEAD>"
      echo "<BODY>"
      echo "<H1>UTIRC Phonebook Search</H1>"
      echo "Enter your search in the search field.<P>"
      echo "This is a case-insensitive substring search: thus"
      echo "searching for 'ian' will find 'Ian' and 'Adriana'."
      echo "</BODY>"
    else
      # Arguments present: report the search string and the matching lines.
      echo "<HEAD>"
      echo "<TITLE>Result of search for \"$*\".</TITLE>"
      echo "</HEAD>"
      echo "<BODY>"
      echo "<H1>Result of search for \"$*\".</H1>"
      echo "<PRE>"
      grep -i "$*" /u/www/Webdocs/Personnel
      echo "</PRE>"
      echo "</BODY>"
    fi
We assume that someone (maybe you?) has accessed the URL http://www.utoronto.ca/cgi-bin/srch-example
What happens? When the script is accessed it always prints the line Content-type: text/html. This is sent to the HTTP server, which in turn forwards it back to the browser. This particular line is a MIME content-type header, and tells the browser what type of data is being sent back. Here, this line tells the browser to expect a text/html document.
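As a small sketch of the header step: in the script above, the content-type line is followed by a bare echo, which prints the blank line that separates the header from the document body. The function name send_header is made up for this illustration and is not part of srch-example.

```shell
#!/bin/sh
# send_header is a hypothetical helper, not part of the original script.
# It prints the MIME content-type header followed by the blank line
# that marks the end of the headers.
send_header() {
    echo "Content-type: $1"
    echo
}

send_header text/html
echo "<TITLE>Example</TITLE>"
```

Everything printed after the blank line is treated as the document itself.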
The if statement checks to see if there are any command-line arguments to the script -- that is, whether the program was launched as if it were typed in as:
    srch-example arg1 arg2
Arguments are passed from the browser to the server script via the URL: arguments are added to the end of the URL, separated from the regular URL by a question mark.
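To make the role of the question mark concrete, here is a rough sketch that splits a URL into the part naming the script and the part carrying the arguments. The function name split_url is an assumption made for this illustration; a real HTTP server does this work itself before launching the script.

```shell
#!/bin/sh
# split_url is a hypothetical helper used only for illustration.
# It splits a URL at the first question mark.
split_url() {
    case $1 in
        *\?*)
            # Text before the first "?" names the script.
            printf 'script: %s\n' "${1%%\?*}"
            # Text after the first "?" carries the arguments.
            printf 'args: %s\n' "${1#*\?}"
            ;;
        *)
            printf 'script: %s\n' "$1"
            printf 'args: (none)\n'
            ;;
    esac
}

split_url 'http://www.utoronto.ca/cgi-bin/srch-example?ian'
split_url 'http://www.utoronto.ca/cgi-bin/srch-example'
```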
In our case there are no arguments, so we execute the first branch of the if. This section of the program echoes some standard HTML markup, and then sends the ISINDEX element. This tells the browser that this is a search, and that it should prompt the user for text input.
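The test on the argument count can be tried outside of any server. The sketch below wraps the same $# check in a function (the name which_branch is invented for this illustration) and reports which branch would run.

```shell
#!/bin/sh
# which_branch is a hypothetical stand-in for the if statement in srch-example.
which_branch() {
    if [ $# = 0 ]
    then
        # No arguments: the script would send the ISINDEX prompt page.
        echo "search form"
    else
        # One or more arguments: the script would run the search.
        echo "search results for: $*"
    fi
}

which_branch              # prints: search form
which_branch ian graham   # prints: search results for: ian graham
```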
The browser displays the received document and prompts the user for a search string. For example, Mosaic will present a fill-in template, where you type the desired search string. When you press return, the browser re-accesses the same URL as before, but this time appends the search string to the URL.
For example, if I filled in the form with my name (Ian) the accessed URL is now
http://www.utoronto.ca/cgi-bin/srch-example?ian
The above URL again accesses the program srch-example, but this time with an argument (ian), so that the second branch of the if is executed. This branch echoes new headings, indicating what was searched for, and runs the grep program to search the file. By default the output of grep is echoed, so the search results are sent to the browser. ISINDEX is NOT added here, as this branch provides the results of the search, but does not contain a second box for user input.
The returned result is a document containing the search results.
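The search step itself can be tried without a server. The following is only a sketch: the temporary file stands in for /u/www/Webdocs/Personnel, its entries are invented for the illustration, and the function name search is likewise an assumption.

```shell
#!/bin/sh
# Hypothetical stand-in for the Personnel file; the entries are invented.
personnel=$(mktemp)
trap 'rm -f "$personnel"' EXIT
printf '%s\n' 'Ian Graham' 'Adriana Example' 'Bob Smith' > "$personnel"

# Mimic the second branch of the script: all of the arguments together
# form one case-insensitive substring pattern.
search() {
    grep -i "$*" "$personnel"
}

search ian
```

Searching for ian matches both the Ian and the Adriana entries, which is the behaviour the prompt page warns the user about.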
That is, briefly, the whole story. If you've patiently read until now, you can test this example and see this script in action by accessing an appropriate test URL.
The data typed in by the user must be specially encoded when placed in a URL, to avoid possible misinterpretation (e.g., accidentally breaking a URL at a space character). Text typed into an ISINDEX query box is encoded in this way to ensure safe transmission. The encoding mechanisms are rather complicated, and involve converting what are called "unsafe" ISO Latin-1 characters into their hexadecimal encodings. For example, the space character becomes the code "%20" (a percent sign followed by the hexadecimal code for the space character). For the details of how this all works you should either read my book (O.K., that was a cheap plug), or consult the detailed on-line documentation on URLs.
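As a rough illustration of the percent encoding, the sketch below replaces every character outside a small safe set with a percent sign and the character's two-digit hexadecimal code. The function name urlencode and the choice of safe characters are assumptions made for this example, and it handles plain ASCII input only; browsers apply their own, more detailed rules.

```shell
#!/bin/sh
# urlencode is a hypothetical helper: an ASCII-only sketch of percent
# encoding, not a complete implementation of the URL rules.
urlencode() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}        # everything after the first character
        c=${s%"$rest"}     # the first character itself
        case $c in
            # Letters, digits and a few punctuation marks pass through.
            [A-Za-z0-9._~-]) out=$out$c ;;
            # Anything else becomes "%" plus its two-digit hex code.
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
}

urlencode 'ian graham'   # prints: ian%20graham
```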
Introduction to HTML © 1994-1998 by Ian Graham. Last Update: 5 January 1998