Introduction to HTML
Last Update: 5 January 1998

9.1 Using ISINDEX for server-side searches

Searching with the ISINDEX element is not difficult. It can appear tricky at first, but it becomes clearer once you remember two basic things:

The interface program between the database tools and the HTTP server is a script or program, usually placed in the server's "cgi-bin" directory. These scripts, known as gateway programs, are accessed via URLs such as http://www.foo.com/cgi-bin/foo, where foo is the name of the script or program, and /cgi-bin/ is a special path that refers to the directory containing the programs and scripts that the server is allowed to execute. The path does not need to be /cgi-bin/, and in fact many sites have several gateway program directories, each one reflecting a particular project or task.

9.1.1 Example usage of ISINDEX

I have the file /u/www/Webdocs/Personnel on my http server. I want to allow someone to search this file for names, using a WWW browser, and I want to do this using the ISINDEX element.

1. The Server-side Script

Sorry: an executable version of this test program is not available at this web site.

Step one is to create a script to interface the server and browser with the search program (here a program named grep). My script is srch-example, which is found in my server's cgi-bin directory.

This program can be accessed from this server via the URL
http://www.java.utoronto.ca/cgi-bin/srch-example -- try it yourself, if you want. The following is the content of srch-example:

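#!/bin/sh
# Gateway script: with no arguments it returns a page containing ISINDEX;
# with arguments it returns the matching lines of the Personnel file.

# The MIME header, and the blank line that ends the headers, are always sent.
echo Content-type: text/html
echo
if [ $# = 0 ]
then
  # No arguments: send the page that prompts for a search string.
  echo "<HEAD>"
  echo "<TITLE>UTIRC Phonebook Search</TITLE>"
  echo "<ISINDEX>"
  echo "</HEAD>"
  echo "<BODY>"
  echo "<H1>UTIRC Phonebook Search</H1>"
  echo "Enter your search in the search field.<P>"
  echo "This is a case-insensitive substring search: thus"
  echo "searching for 'ian' will find 'Ian' and 'Adriana'."
  echo "</BODY>"
else
  # Arguments present: report what was searched for, then the matches.
  echo "<HEAD>"
  echo "<TITLE>Result of search for \"$*\".</TITLE>"
  echo "</HEAD>"
  echo "<BODY>"
  echo "<H1>Result of search for \"$*\".</H1>"
  echo "<PRE>"
  # -e keeps a search string that starts with "-" from being read as an option.
  grep -i -e "$*" /u/www/Webdocs/Personnel
  echo "</PRE>"
  echo "</BODY>"
fi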

2. How Does This Script Work?

We assume that someone (maybe you?) has accessed the URL

http://www.utoronto.ca/cgi-bin/srch-example

What happens? When the script is accessed it always prints the line Content-type: text/html. This is sent to the HTTP server, which in turn forwards it back to the browser. This particular line is a MIME content-type header, and tells the browser what type of data is being sent back. Here, this line tells the browser to expect a text/html document.
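The header step can be tried on its own. The following is a minimal sketch, not part of srch-example: the helper name send_html_header is made up for illustration. It prints the content-type line, then the blank line that separates the header from the document, then a small body.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for this sketch): print the MIME
# content-type header, then the blank line that ends the header block.
send_html_header() {
  echo "Content-type: text/html"
  echo
}

# A tiny but complete response: header block first, HTML after it.
send_html_header
echo "<TITLE>Header test</TITLE>"
echo "<H1>Header test</H1>"
```

If the blank line is left out, the server has no way to tell where the headers stop and the document starts.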

3. ISINDEX Signals a Search

The if statement checks whether there are any command-line arguments to the script, that is, whether the program was launched as if it had been typed in as:

    srch-example arg1 arg2 

Arguments are passed from the browser to the server script via the URL: arguments are added to the end of the URL, separated from the regular URL by a question mark. In our case there are no arguments, so we execute the first branch of the if. This section of the program echoes some standard HTML markup, and then sends the ISINDEX element. This tells the browser that this is a search, and that it should prompt the user for text input.
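The link between the URL and the script's argument count can be sketched in a few lines of shell. This is only an illustration: the helper names query_of and count_args are made up, and the sketch assumes that the server splits the text after the question mark into separate arguments at "+" signs.

```shell
#!/bin/sh
# Hypothetical helper: print the part of a URL after the first "?",
# or an empty line when the URL has no query string.
query_of() {
  case "$1" in
    *\?*) printf '%s\n' "${1#*\?}" ;;
    *)    printf '\n' ;;
  esac
}

# Hypothetical helper: print how many arguments the script would see,
# assuming the server splits the query string into words at "+" signs.
count_args() {
  query=$(query_of "$1")
  old_ifs=$IFS
  IFS='+'
  set -f            # no filename expansion while splitting
  set -- $query     # split the query into positional parameters
  set +f
  IFS=$old_ifs
  echo $#
}

count_args "http://www.foo.com/cgi-bin/foo"             # prints 0
count_args "http://www.foo.com/cgi-bin/foo?ian"         # prints 1
count_args "http://www.foo.com/cgi-bin/foo?ian+graham"  # prints 2
```

A count of 0 corresponds to the first branch of the if, and any other count to the second.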

The browser displays the received document and prompts the user for a search string. For example, Mosaic will present a fill-in template, where you type the desired search string. When you press return, the browser re-accesses the same URL as before, but this time appends the search string to the URL.

For example, if I filled in the form with my name (Ian) the accessed URL is now

http://www.utoronto.ca/cgi-bin/srch-example?ian

4. Second URL Access: Search Results

The above URL again accesses the program srch-example, but this time with an argument (ian), so that the second branch of the if is executed. This branch echoes new headings, indicating what was searched for, and runs the grep program to search the file. By default the output of grep is echoed, so the search results are sent to the browser. ISINDEX is NOT added here, as this branch provides the results of the search, but does not contain a second box for user input. The returned result is a document containing the search results.
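Both accesses can be imitated from a command line, without a server. The sketch below is a cut-down stand-in for the script, not the script itself: the function name srch and the three phonebook entries are made up, and a temporary file replaces /u/www/Webdocs/Personnel.

```shell
#!/bin/sh
# Stand-in data file with made-up entries, so the sketch does not need
# the real Personnel file. It is removed when the shell exits.
data=$(mktemp)
trap 'rm -f "$data"' EXIT
printf '%s\n' "Ian 555-0101" "Adriana 555-0102" "Robert 555-0103" > "$data"

# Cut-down version of the two branches, taking the same arguments.
srch() {
  if [ $# = 0 ]
  then
    echo "<ISINDEX>"            # first access: ask for a search string
  else
    echo "<PRE>"
    grep -i -e "$*" "$data"     # second access: return the matching lines
    echo "</PRE>"
  fi
}

srch          # like accessing .../srch-example
srch ian      # like accessing .../srch-example?ian
```

The first call prints only the ISINDEX line. The second prints the entries for Ian and Adriana between the PRE tags, since the search is a case-insensitive substring match.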

5. A Demonstration

That is, briefly, the whole story. If you've patiently read until now, you can test this example and see this script in action by accessing an appropriate test URL.

Data Encoding in URLs

The data typed in by the user must be specially encoded when placed in a URL, to avoid possible misinterpretation (e.g., accidentally breaking a URL at a space character). In addition, text input by the user into an ISINDEX query box is also encoded, to ensure safe transmission. The encoding mechanisms are rather complicated, and involve converting what are called "unsafe" ISO Latin-1 characters into their hexadecimal encodings. For example, the space character becomes the code "%20" (a percent sign followed by the hexadecimal code for the space character). For the details of how this all works you should either read my book (O.K., that was a cheap plug), or consult the detailed on-line documentation on URLs.
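The percent-plus-hexadecimal idea can be sketched in shell. This is a rough illustration for plain ASCII input only, not a full implementation of the URL rules: the function name urlencode is made up, and the set of characters passed through unchanged (letters, digits, and - _ . ~) is an assumption of the sketch.

```shell
#!/bin/sh
# Hypothetical helper: percent-encode an ASCII string for use in a URL.
# Letters, digits and - _ . ~ pass through; any other character becomes
# "%" followed by its two-digit hexadecimal code.
# The body runs in a subshell so the locale change stays local.
urlencode() (
  LC_ALL=C
  rest=$1
  out=
  while [ -n "$rest" ]
  do
    tail=${rest#?}          # everything after the first character
    c=${rest%"$tail"}       # the first character itself
    case "$c" in
      [a-zA-Z0-9._~-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;   # "'c" yields the code of c
    esac
    rest=$tail
  done
  printf '%s\n' "$out"
)

urlencode "ian graham"      # prints ian%20graham
```

The space is the only character changed here, since its hexadecimal code is 20.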


© 1994-1998 by Ian Graham