The Common Gateway Interface (CGI)
What is CGI programming anyway? What is the BIG DEAL?? And why the heck is it called a gateway?
Very good questions. Ones that bugged me early on and ones that still seem to be asked quite frequently.
CGI programming involves designing and writing programs that receive their starting commands from a Web page, usually a Web page that uses an HTML form to initiate the CGI program. The HTML form has become the method of choice for sending data across the Net because of the ease of setting up a user interface using the HTML Form and Input tags. With the HTML form, you can set up input windows, pull-down menus, checkboxes, radio buttons, and more with very little effort. In addition, the data from all these data-entry methods is formatted automatically and sent for you when you use the HTML form. You learn about the details of using the HTML form in Tutorial 4, “Using Forms to Gather and Send Data,” and Tutorial 5, “Decoding Data Sent to Your CGI Program.”
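To make "formatted automatically" a little more concrete, here is a minimal sketch of the URL encoding a browser applies to form fields, and of how a program can decode it again. It is written in Python rather than the Perl used later in these tutorials, and the field names (name, color) and the helper names encode_form and decode_form are made up for illustration.

```python
from urllib.parse import urlencode, parse_qs


def encode_form(fields):
    """Encode form fields roughly the way a browser encodes a form submission."""
    # Spaces become "+" and fields are joined with "&".
    return urlencode(fields)


def decode_form(encoded):
    """Decode an encoded form string back into a dictionary of single values."""
    # parse_qs returns a list per field name; this sketch keeps the first value.
    return {name: values[0] for name, values in parse_qs(encoded).items()}


# Hypothetical form with two input fields.
encoded = encode_form({"name": "Ada Lovelace", "color": "blue"})
print(encoded)               # name=Ada+Lovelace&color=blue
print(decode_form(encoded))  # {'name': 'Ada Lovelace', 'color': 'blue'}
```

The decoding half is the part your CGI program is responsible for; the browser takes care of the encoding half.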
CGI programs don’t have to be started by a Web page, however. They can be started as the result of a Server Side Include (SSI) execution command (covered in detail in Tutorial 3 “Using Server Side Include Commands”). You even can start a CGI program from the command line. But a CGI program started from the command line probably will not act the way you expect or designed it to act. Why is that? Well, a CGI program runs under a unique environment. The WWW server that started your CGI program creates some special information for your CGI program, and it expects some special responses back from your CGI program.
Before your CGI program is initiated, the WWW server already has created a special processing environment in which your CGI program operates. That environment includes all the incoming HTTP request headers (covered in Tutorial 2 “Understanding How the Server and Browser Communicate”), translated into environment variables (covered in Tutorial 6 “Using Environment Variables in Your Programs”) that your CGI program can read for all kinds of valuable information. In addition to system information (such as the current date), the environment includes information about who is calling your CGI program, from where your program is being called, and possibly even state information to help you keep track of a single Web visitor’s actions. (State information is anything that keeps track of what your program did the last time it was called.)
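As a rough illustration of that environment, the sketch below reads a few of the standard CGI variables and shows the naming rule servers use when turning a request header into a variable. It is in Python rather than Perl, and the function names header_to_env_name and describe_request are invented for this example. Run from the command line, most of these values come back as "(not set)", which is one reason a CGI program behaves differently outside the server.

```python
import os


def header_to_env_name(header_name):
    """Apply the CGI naming rule: HTTP_ prefix, uppercase, hyphens to underscores."""
    return "HTTP_" + header_name.upper().replace("-", "_")


def describe_request(environ=os.environ):
    """Collect a few standard CGI variables, with a placeholder when one is missing."""
    return {
        "method": environ.get("REQUEST_METHOD", "(not set)"),      # GET, POST, ...
        "caller_address": environ.get("REMOTE_ADDR", "(not set)"), # who is calling
        "browser": environ.get(header_to_env_name("User-Agent"), "(not set)"),
        "cookie": environ.get(header_to_env_name("Cookie"), "(not set)"),  # possible state
    }


if __name__ == "__main__":
    for label, value in describe_request().items():
        print(label, "=", value)
```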
Next, the server tries to determine what type of file or program it is calling because it must act differently based on the type of file it is accessing. So, your WWW server first looks at the file extension to determine whether it needs to parse the file looking for SSI commands, execute the Perl interpreter to compile and interpret a Perl program, or just generate the correct HTTP response headers and return an HTML file.
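The decision described above can be pictured as a simple lookup on the file extension. The following Python sketch is only a model of that idea, not how any particular server is implemented; the extension table and the name choose_action are assumptions, and real servers let you configure these mappings.

```python
import os.path

# Hypothetical extension table; a real server reads this from its configuration.
ACTIONS = {
    ".shtml": "parse the file for SSI commands",
    ".pl": "run the file with the Perl interpreter",
    ".cgi": "execute the file as a CGI program",
    ".html": "add response headers and return the HTML file",
}


def choose_action(path):
    """Pick what the server should do with a requested file, based on its extension."""
    extension = os.path.splitext(path)[1].lower()
    # Anything unrecognized is treated as a plain file in this sketch.
    return ACTIONS.get(extension, "return the file as-is")


if __name__ == "__main__":
    for requested in ("/index.html", "/news/today.shtml", "/cgi-bin/order.pl"):
        print(requested, "->", choose_action(requested))
```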
After your server starts up your SSI or CGI program (or even HTML file), it expects a specific type of response from the SSI or CGI program. If your server is just returning an HTML file, it expects that file to be a text file with HTML tags and text in it. In that case, the server itself is responsible for generating the required HTTP response headers, which tell the calling browser the status of the browser’s request for a Web page and what type of data the browser will be receiving, among other things.
The SSI file works almost like a regular HTML file. The only difference is that, with an SSI file, the server must look at each line in the file for special SSI commands. If it finds an SSI command, it tries to execute it. The output from the executed SSI command is inserted into the returned HTML file, replacing the special HTML syntax for calling an SSI command. The output from the SSI command will appear within the HTML text just as if it were typed at the location of the SSI command. SSI commands can include other files, execute system commands, and perform many useful functions. The server uses the file extension of the requested Web page to determine whether it needs to parse a file for SSI commands. SSI files typically have the extension .shtml.
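The replace-in-place behavior is easy to model. The Python sketch below handles just one command, written in the comment-style form <!--#echo var="NAME" --> that Apache-style servers use, and swaps it for a value from a dictionary. The function name expand_ssi, the "(none)" fallback, and the sample page are assumptions for illustration; a real server supports many more commands and supplies the variable values itself.

```python
import re

# Matches only the echo command, e.g. <!--#echo var="DATE_LOCAL" -->
SSI_ECHO = re.compile(r'<!--#echo\s+var="([^"]+)"\s*-->')


def expand_ssi(page, variables):
    """Replace each SSI echo command in the page with the named variable's value."""
    def substitute(match):
        # Unknown variables get a placeholder in this sketch.
        return variables.get(match.group(1), "(none)")
    return SSI_ECHO.sub(substitute, page)


if __name__ == "__main__":
    page = '<p>Today is <!--#echo var="DATE_LOCAL" -->.</p>'
    print(expand_ssi(page, {"DATE_LOCAL": "Monday"}))  # <p>Today is Monday.</p>
```

Notice that the browser never sees the command itself, only the text that replaced it.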
If the server identifies the file as an executable CGI program, it executes the program as appropriate. After the server executes your CGI program, your program normally responds with the minimum required HTTP response headers and then some HTML tags. If your CGI program is returning HTML, it should output a response header of Content-Type: text/html, followed by a blank line that marks the end of the headers. This gives the server enough information to generate any other required HTTP response headers.
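Here is about the smallest CGI response you can write, sketched in Python rather than Perl. The one detail that trips people up is the empty line after the Content-Type header: it tells the server where the headers stop and the HTML begins. The function name build_response and the page text are made up for this example.

```python
import sys


def build_response(body_html):
    """Return a minimal CGI response: one header, a blank line, then the HTML."""
    # The "\n\n" ends the header line and adds the required blank line.
    return "Content-Type: text/html\n\n" + body_html


if __name__ == "__main__":
    page = "<html><body><h1>Hello from CGI</h1></body></html>\n"
    # A CGI program sends its response to the server on standard output.
    sys.stdout.write(build_response(page))
```

If you leave out the blank line, the server typically has no way to tell your HTML from a malformed header and reports an error instead of your page.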
After all that explanation, what is CGI programming? CGI programming is writing the programs that receive and translate data sent via the Internet to your WWW server. CGI programming is using that translated data and understanding how to send valid HTTP response headers and HTML tags back to your WWW client.
The big deal in all this is a brand new dynamic programming environment. All kinds of new commerce and applications are going to occur over the Internet. You can’t do this with just HTML. HTML by itself makes a nice window, but to do anything more than look pretty requires programming, and that programming must understand the CGI environment.
Finally, just why is it called a gateway? Quite often, your program acts as a gateway or interface program between other, larger applications. CGI programs often are written in scripting languages such as Perl. Scripting languages really are not meant for large applications. You might create a program that translates and formats the data being sent to it from applications such as online catalogs, for example. This translated data then is passed to some type of database program. The database program does the necessary operations on its database and returns the results to your CGI program. Your CGI program then can reformat the returned data as needed for the Internet and return it to the online catalog customer, thus acting as a gateway between the HTML catalog, the HTTP request/response headers, and the database program. I’m sure that you can think of other, cooler examples, but this one probably will be pretty common in the near future.
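The catalog scenario can be sketched end to end in a few lines. The Python example below (Python, not Perl, and only a toy) decodes the incoming request data, passes it to a database, and reformats the answer as an HTTP response. The in-memory SQLite database stands in for whatever database program sits behind your gateway, and the table, the item names, the prices, and the function names make_catalog and handle_request are all invented for illustration.

```python
import sqlite3
from html import escape
from urllib.parse import parse_qs


def make_catalog():
    """Build a tiny in-memory catalog standing in for the real database program."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE items (name TEXT, price REAL)")
    db.executemany("INSERT INTO items VALUES (?, ?)",
                   [("teapot", 12.5), ("kettle", 30.0)])
    return db


def handle_request(query_string, db):
    """Act as the gateway: decode the request, query the database, format HTML."""
    fields = parse_qs(query_string)
    wanted = fields.get("item", [""])[0]
    # A parameterized query keeps visitor input out of the SQL text itself.
    row = db.execute("SELECT name, price FROM items WHERE name = ?",
                     (wanted,)).fetchone()
    if row is None:
        body = "<p>No such item: %s</p>" % escape(wanted)
    else:
        body = "<p>%s costs $%.2f</p>" % (escape(row[0]), row[1])
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    print(handle_request("item=teapot", make_catalog()))
```

The database never talks to the browser and the browser never talks to the database; everything passes through the one small program in the middle.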
You already can see a lot of interaction between the HTTP request/response headers, HTML, and your CGI programs. Each of these topics is covered in detail in these tutorials, but you should understand how these pieces fit together to create the entire CGI environment.