Beginning CGI Programming with Perl

Server Side Includes

In the preceding Tutorial, you learned about the environment of CGI programming and how the server communicates with the browser. Today, without using any special programming languages, gotos, if then else statements, or any other complex programming structures, you will learn how to build dynamic Web pages. In this Tutorial, you will discover Server Side Include commands (SSIs). In particular, you will look at these topics:

  • Looking at the downside of SSIs
  • Making SSIs work on your server
  • Looking at the format of SSIs
  • Changing the format of SSIs
  • Including other files in your Web page
  • Adding the size and last modification date of your Web files
  • Executing system commands from within your parsed HTML files
  • Deciding whether SSIs are a security risk

This transition from an unchanging Web page to a Web page that can interact with your Web client can begin with very little programming expertise.

Instead of writing code to perform dynamic and useful tasks, you can use commands called Server Side Includes. Server Side Includes are special HTML-like commands that your server executes for you as it parses your HTML file.

Server Side Includes probably were started to handle the desire to include a common file inside a bunch of different files. The most common use for SSIs is providing a signature file or company logo that you want to add to every file you create. The Include file resides on the server and is included whenever any HTML file that contains the include command is requested, which is where the term Server Side Include comes from.

Using SSI Negatives

As with every other neat and cool thing you can do, SSIs are somewhat of a two-edged sword. The server has to do a lot more work to process these includes. When the server returns an HTML file, it generates the appropriate response headers and sends the HTML file back to the client. No fuss and very little work.

When the server runs a CGI program, your program executes either as a compiled binary or under an interpreter. Your CGI program should generate some HTTP response headers; the server’s job is then to generate any additional required HTTP response headers and pass the CGI-generated HTML back to the client/browser.
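As a minimal sketch of that division of labor, the following Python function builds the text a CGI program would write to its standard output: a Content-type header, a blank line, and then the HTML body. Python is used here only for illustration; the function name cgi_response and the sample body are hypothetical, and the sketch assumes the server adds the status line and any remaining headers.

```python
def cgi_response(html_body: str) -> str:
    """Return the text a CGI program writes to standard output."""
    # The program supplies at least a Content-type header...
    header = "Content-type: text/html"
    # ...followed by a blank line that ends the header section.
    return header + "\n\n" + html_body


if __name__ == "__main__":
    # Hypothetical page body; a real program would generate this dynamically.
    print(cgi_response("<html><body><h1>Hello from CGI</h1></body></html>"))
```

Everything before the blank line is header material for the server to extend; everything after it is passed through to the browser.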

When the server returns a file with SSI commands in it, however, it must read each line of the file looking for the special SSI command syntax. This is called parsing a file. SSI commands can appear anywhere in your HTML file. This means that your server must make a special effort to find the commands in your HTML file.

This parsing of files puts an extra burden on your server. That also means that SSI files are slower when returned to your Web client than regular HTML files. The more SSI files your server has to handle, the more processing load on your server, and, as a consequence, the slower your server operates. Do not let this stop you from using SSIs; just be aware of the cost and benefits of using SSI files.
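The scanning step described above can be sketched in a few lines of Python. This is only an illustration, not the server's actual implementation: it assumes the common <!--#command tag="value" --> form of an SSI command (the format is covered later in this Tutorial), and the names SSI_PATTERN and find_ssi_commands are made up for the example.

```python
import re

# Assumed SSI form: <!--#command tag="value" ... -->
SSI_PATTERN = re.compile(r"<!--#(\w+)\s*(.*?)\s*-->")


def find_ssi_commands(html: str) -> list:
    """Return (line number, command, arguments) for each SSI command found."""
    found = []
    # Every line has to be examined, because a command can appear anywhere.
    for number, line in enumerate(html.splitlines(), start=1):
        for match in SSI_PATTERN.finditer(line):
            found.append((number, match.group(1), match.group(2)))
    return found


if __name__ == "__main__":
    # Hypothetical page with one include command on its third line.
    page = (
        "<html><body>\n"
        "<h1>Welcome</h1>\n"
        '<!--#include file="signature.html" -->\n'
        "</body></html>\n"
    )
    for number, command, arguments in find_ssi_commands(page):
        print(number, command, arguments)
```

Even this toy version has to touch every line of the page, which is the extra work a plain HTML file never causes.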

At this point, you should be wondering how the server knows whether to parse a file looking for SSI commands. How does the server know what those commands look like, anyway? And do SSI commands work on every server?

First of all, special files on your server define whether SSI commands will be allowed on your system. And then other files exist that define which files will be parsed for SSI commands and which files will be treated as CGI programs.

Understanding How SSIs Work

The NCSA server, currently the most popular server on the Net, and several other HTTP servers support SSIs.

Next, SSIs have to be enabled by your System Administrator before they will work. SSIs require the server to do more work with every SSI document handled by the server. As you learned in the preceding Tutorial, the server is responsible for finding, reading, formatting, and outputting the headers and HTML files requested by the client. So the System Administrator for your server makes several decisions that affect whether you can use SSIs and how many of them are enabled for you.

Deciding Whether to Enable SSIs

The first decision is whether to allow SSIs at all on the server. For the most part, your local Internet provider wants to give you all the freedom it can on your server. So most System Administrators decide to turn on SSIs. Because of the extra burden placed on the server, however, limitations are placed on the types of files that can have SSI commands. This limitation is based on the ending characters of each filename, called the filename extension. Usually, it’s something like .shtml. So any file that ends in .shtml is handled as an SSI file by the server. You can set the filename extension by using the AddType directive in the srm.conf file, which is described later in the section “Using the AddType Command for SSIs.”

In order for SSIs to work, the server has to read every line of every SSI file looking for the special SSI commands. A significant extra computing and disk-access burden is placed on any server that has to parse its files before sending them back to the client. Usually, that burden is not so great that SSIs are turned off. But if a site is very, very, very busy, and it cannot handle all the traffic it is getting, one way to deal with server overload is to turn off SSIs.

Using the Options Directive

In order to enable SSI commands at all, the various directories that can use SSI commands must be enabled. This is done by modifying a file called access.conf. The access.conf file controls each directory’s capability to execute different types of WWW services. In this case, you are interested in SSI commands. The access.conf file is discussed in detail in Tutorial 12, “Guarding Your Server Against Unwanted Guests.” Your current interest is in enabling SSI commands for your server. This is done with the Options directive.

On my server, the Options directive is set to All: Options All. This means that all features are enabled in the directory or directories identified with the Options All command. My server allows SSI commands in all directories under the document root. The document root consists of all the directories that are accessible to normal users and Web visitors. My life is a lot easier because of this, and it’s one of the reasons I use this server. If your server is not enabled so that you can use SSIs, send e-mail to your System Administrator or find another server.

If you are just interested in enabling SSI commands, you should set the Options directive to Includes: Options Includes. This enables all the available SSI features.

For security reasons, you may see your server set to

Options IncludesNoExec

This enables you to use SSIs but disables the SSI exec command.
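The effect of these three settings can be summarized in a short Python sketch. It is a simplification for illustration only: the command names listed are assumed to be the usual NCSA-style SSI commands, the function name ssi_command_allowed is hypothetical, and a real server applies further rules from access.conf that are not modeled here.

```python
# SSI commands assumed for this sketch (NCSA-style names).
SSI_COMMANDS = {"config", "include", "echo", "fsize", "flastmod", "exec"}


def ssi_command_allowed(option: str, command: str) -> bool:
    """Say whether an SSI command may run under a given Options setting."""
    if command not in SSI_COMMANDS:
        return False
    if option in ("All", "Includes"):
        # Every SSI command is enabled, including exec.
        return True
    if option == "IncludesNoExec":
        # SSIs work, but the exec command is switched off.
        return command != "exec"
    # Any other setting: SSIs are treated as disabled in this sketch.
    return False


if __name__ == "__main__":
    for option in ("All", "Includes", "IncludesNoExec", "None"):
        print(option, ssi_command_allowed(option, "exec"))
```

The only difference between Includes and IncludesNoExec in this model is the exec command, which is why the second setting is the one chosen for security reasons.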

The access.conf file and its directives are covered in detail in Tutorial 12, so for now accept this outline of how to set up SSIs on your server. For a complete tutorial on setting up an NCSA httpd server, see

http://hoohoo.ncsa.uiuc.edu/docs/tutorials

Using the AddType Command for SSIs

Now that you can add SSI commands to your directory, the server must decide whether to parse all files or just special files. Usually, the server limits SSI parsing to a special file type, as described previously. This is done by modifying the srm.conf file. The srm.conf file is usually in a directory named conf, below one of the top-level directories on your server. Conf stands for configuration, so all the files that manage the configuration of your server should be below the conf directory. This is not mandatory; it’s just neater.

Using the srm.conf File

In the conf directory, there should be a file called srm.conf. This is the file that decides which files will be parsed for SSI commands. Remember that your goal is to allow the use of SSI commands but to limit their impact on the server. Inside this file is the command AddType. The AddType command sets the filename extension type for various applications. Listing 3.1 shows a typical srm.conf file. It is a partial listing, but it gives you a good feel for how the AddType command fits into the overall srm.conf file. Only a few of the commands have been deleted. The deleted commands added similar types, so leaving them out does not change the outline of the srm.conf file.

Listing 3.1. The srm.conf file.

DocumentRoot /usr/local/business/http/accn.com
UserDir public-web
DirectoryIndex blocked.html index.cgi index.html home.html welcome.html index.htm

FancyIndexing on

AddIconByType (TXT,/icons/text.gif) text/*
AddIconByType (IMG,/icons/image2.gif) image/*
AddIconByType (SND,/icons/sound2.gif) audio/*
AddIcon /icons/movie.gif .mpg .qt
[additional ADDIcon commands deleted]

DefaultIcon /icons/unknown.gif
ReadmeName README
HeaderName HEADER
IndexIgnore */.??* *~ *#* */HEADER* */README*
IndexOptions FancyIndexing
AccessFileName .htaccess
DefaultType text/plain

AddLanguage en .en
[additional ADDLanguage commands deleted]

LanguagePriority en fr de

AddEncoding x-compress Z
AddEncoding x-gzip gz

Alias /icons/ /usr/local/www/icons/

ScriptAlias /cgi-bin/ /usr/local/business/http/accn.com/cgi-bin/
ScriptAlias /mailto   /usr/local/www/cgi-bin/mailto.pl
[additional ScriptAlias commands deleted]

AddType text/x-server-parsed-html .shtml
AddType application/x-httpd-cgi .cgi
AddType image/gif .gif87
AddType image/gif .gif89

AddType text/x-server-parsed-html3 .shtml3
AddType httpd/send-as-is asis
AddType application/x-type-map var
AddType application/x-httpd-imap map

Toward the end of Listing 3.1, you can see several AddType commands. The first AddType command adds a subtype to the MIME text type. The AddType directive allows the server to add new MIME types or subtypes to its list of valid types. The MIME type tells the server what type of document it is managing. The srm.conf file is not responsible for telling the server about all the types it needs to handle. As you can see from Listing 3.1, however, several new types and subtypes have been added to the server’s basic types.

You should be interested in the x-server-parsed-html type. This is a subtype of the MIME text type. The leading x- in the subtype name marks it as a new or experimental type. Any file with the extension shtml will be managed as a server-parsed HTML file, so the server parses every file with the shtml extension for SSI commands.
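To make the lookup concrete, here is a small Python sketch of how AddType lines could be turned into an extension table and then used to decide whether a file gets parsed. It is an assumption-laden illustration, not how the server is actually written: the function names parse_add_types and is_server_parsed are hypothetical, and only the simple AddType type/subtype extension form seen in Listing 3.1 is handled.

```python
def parse_add_types(config_text: str) -> dict:
    """Map each filename extension to the MIME type named by AddType."""
    types = {}
    for line in config_text.splitlines():
        fields = line.split()
        # An AddType line has the form: AddType type/subtype extension
        if len(fields) >= 3 and fields[0] == "AddType":
            for extension in fields[2:]:
                # Accept the extension with or without a leading dot.
                types[extension.lstrip(".")] = fields[1]
    return types


def is_server_parsed(filename: str, types: dict) -> bool:
    """True if the file's extension maps to the server-parsed HTML type."""
    extension = filename.rsplit(".", 1)[-1] if "." in filename else ""
    return types.get(extension) == "text/x-server-parsed-html"


if __name__ == "__main__":
    # Two AddType lines taken from Listing 3.1.
    sample = (
        "AddType text/x-server-parsed-html .shtml\n"
        "AddType application/x-httpd-cgi .cgi\n"
    )
    types = parse_add_types(sample)
    print(is_server_parsed("index.shtml", types))
    print(is_server_parsed("index.html", types))
```

Because the decision depends only on the extension, a file named index.html is never scanned for SSI commands under this configuration, no matter what it contains.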

Tips

DO name all files that include SSI directives with the extension defined in your srm.conf file. This usually is shtml.

DON’T use just any extension for your files that include SSI commands.

DO check out the srm.conf file. Look at the AddType directive to figure out what your SSI files should be named.