This article assumes some basic understanding of how the HTTP protocol works.
When a webserver receives a request for a script, and decides that it is permitted to run the script (because it is in the correct location with the correct access rights), it sets a number of environment variables (usually known as system variables on RISC OS) and executes the script.
If the method is POST (or PUT), then the script will have either the encoded form data or the uploaded file supplied on STDIN. The length of the supplied data will be indicated by the value of the CONTENT_LENGTH environment variable.
If the URI contains a '?', then everything in the URI after the '?' will be stored in the QUERY_STRING environment variable. (This should occur regardless of method, i.e. it should work for a form POSTed to script.cgi?var=val exactly the same as it would for a simple GET request for script.cgi?var=val.)
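
The two input channels described above can be sketched in a few lines of Python. This is only an illustration, not part of any particular server's API: the function name `read_request_data` and the simulated environment are assumptions made for the example, and a real script would read from the genuine environment and STDIN.

```python
import io
import os
import sys


def read_request_data(environ=os.environ, stdin=None):
    """Return (query_string, body_bytes) for the current CGI request."""
    # Everything after the '?' in the URI, still URL-encoded; set for any method.
    query_string = environ.get("QUERY_STRING", "")

    body = b""
    if environ.get("REQUEST_METHOD", "GET") in ("POST", "PUT"):
        # Read exactly CONTENT_LENGTH bytes; the server is not obliged to
        # signal end-of-file, so reading until EOF could block.
        try:
            length = int(environ.get("CONTENT_LENGTH") or 0)
        except ValueError:
            length = 0
        if stdin is None:
            stdin = sys.stdin.buffer
        body = stdin.read(length) if length > 0 else b""
    return query_string, body


if __name__ == "__main__":
    # Simulated request: a form POSTed to script.cgi?var=val
    fake_env = {
        "REQUEST_METHOD": "POST",
        "QUERY_STRING": "var=val",
        "CONTENT_LENGTH": "9",
    }
    print(read_request_data(fake_env, io.BytesIO(b"name=Fred&extra")))
```

Passing the environment and input stream as parameters makes the function easy to try outside a webserver; only the first nine bytes of the simulated input are consumed because that is what CONTENT_LENGTH announces.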
A script will usually decode the QUERY_STRING and (if supplied) the data read from STDIN. It will do any processing required (e.g. sending mail) and then output the returned page to STDOUT. It must write the response headers as well, including the Content-Type of the output, followed by a blank line and then the page itself, e.g.:
Content-Type: text/html

<html>
<head><title>Sample Output</title></head>
<body>
<h1>Sample Output</h1>
<p>This is the output of a sample CGI script</p>
</body>
</html>
The web server will capture the output of the script and send it to the web browser, while the script exits, content that it has once again served its purpose... :)
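
As a rough Python sketch of the decode-then-respond cycle just described: the helper names `decode_form` and `build_response` are invented for this example, and the page layout is only an assumption. The decoding relies on the standard library's `urllib.parse.parse_qsl`, which turns '+' into spaces and expands %XX escapes.

```python
import html
import sys
from urllib.parse import parse_qsl


def decode_form(encoded):
    """Decode 'a=1&b=two+words' style data into a list of (name, value) pairs."""
    # keep_blank_values=True keeps fields that were submitted empty.
    return parse_qsl(encoded, keep_blank_values=True)


def build_response(pairs):
    """Return the complete CGI output: headers, a blank line, then the page."""
    # html.escape() stops submitted data from being interpreted as markup.
    rows = "".join(
        "<li>%s = %s</li>\n" % (html.escape(name), html.escape(value))
        for name, value in pairs
    )
    body = (
        "<html><head><title>Sample Output</title></head>\n"
        "<body><h1>Sample Output</h1>\n<ul>\n" + rows + "</ul></body></html>\n"
    )
    # The empty line (\r\n\r\n) marks the end of the headers.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    sample = "name=Mr+Jackson&postcode=GH20%2034FE"
    sys.stdout.write(build_response(decode_form(sample)))
```

Building the whole response as a string before writing it keeps the header and body together, which makes it harder to forget the blank line that separates them.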
Environment Variables
Certain environment variables are used to supply information to the script:
Variable name | Example | Description |
---|---|---|
AUTH_TYPE | basic | If the server supports user authentication, and the script is protected, this is the protocol-specific authentication method used to validate the user. |
CONTENT_LENGTH | 1024 | The length, in bytes, of the attached content as given by the client. |
CONTENT_TYPE | application/x-www-form-urlencoded | For queries which have attached information, such as HTTP POST and PUT, this is the content type of the data. |
GATEWAY_INTERFACE | CGI/1.1 | The revision of the CGI specification to which this server complies. Format: CGI/revision |
HTTP_ACCEPT | text/html,image/gif,image/jpeg,*/* | The MIME types which the client will accept, as given by HTTP headers. Other protocols may need to get this information from elsewhere. Each item in this list should be separated by commas as per the HTTP spec. It happens to be one of the most poorly implemented headers of all time. Scripts shouldn't rely on this unless checking for specific types, e.g. WAP browser auto detection. |
HTTP_COOKIE | COOKIE1_NAME=COOKIE1_VALUE; COOKIE2_NAME=COOKIE2_VALUE | The Cookie: line sent by the client. This contains all the cookies accessible to the script (i.e. valid for that domain, path, time and level of security; see cookies for further information). |
HTTP_USER_AGENT | Mozilla 4.72 [en] (Compatible; RISC OS 4.02; Oregano 1.10) | The browser the client is using to send the request. General format: software/version library/version. |
HTTP_REFERER | http://www.iconbar.com/index.html | The Referer: HTTP header sent by the client (the header name really is spelt with a single 'r'). |
PATH_INFO | ??? | The extra path information, as given by the client. In other words, scripts can be accessed by their virtual pathname, followed by extra information at the end of this path. The extra information is sent as PATH_INFO. This information should be decoded by the server if it comes from a URL before it is passed to the CGI script. |
PATH_TRANSLATED | ADFS::HardDisc4.$.WebDesign.WebJames.site.cgi-bin2 | The translated version of PATH_INFO: the server takes that path and applies its virtual-to-physical mapping, giving a location on the local filesystem. |
QUERY_STRING | name=Mr+Jackson&postcode=GH20%2034FE | The information which follows the ? in the URL which referenced this script. This is the query information. It should not be decoded in any fashion. This variable should always be set when there is query information. |
REMOTE_ADDR | 234.234.23.212 | The IP address of the remote host making the request. |
REMOTE_HOST | modem34.fsnet.co.uk | The hostname making the request. If the server does not have this information, it should set REMOTE_ADDR and leave this unset. |
REMOTE_IDENT | ??? | If the HTTP server supports RFC 931 identification, then this variable will be set to the remote user name retrieved from the server. Usage of this variable should be limited to logging only. |
REMOTE_USER | bob | If the server supports user authentication, and the script is protected, this is the username they have authenticated as. |
REQUEST_METHOD | GET | The method with which the request was made. For HTTP, this is "GET", "HEAD", "POST", etc. |
REQUEST_URI | /cgi-bin2/script?foo=bar | The full URI as requested by the user (i.e. SCRIPT_NAME [+ '?' + QUERY_STRING]). This variable is not part of the CGI/1.1 specification, so not every server sets it. |
SCRIPT_NAME | /cgi-bin2/script | A virtual path to the script being executed, used for self-referencing URLs. |
SERVER_ADMIN | webmaster@host.com | The email address of the server administrator |
SERVER_NAME | www.host.com | The server's hostname, DNS alias, or IP address as it would appear in self-referencing URLs. |
SERVER_PORT | 80 | The port number to which the request was sent. |
SERVER_PROTOCOL | HTTP/1.1 | The name and revision of the information protocol this request came in with. Format: protocol/revision |
SERVER_SOFTWARE | Apache/1.3.20 (Unix) PHP/4.0.5 | The name and version of the information server software answering the request (and running the gateway). Format: name/version |
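
To show how a script might put a few of these variables to use, here is a small Python sketch. The function names `self_url` and `cookies` are made up for this example, and the code assumes plain HTTP, since none of the variables in the table says which scheme was used. Cookie parsing uses the standard library's `http.cookies.SimpleCookie`.

```python
import os
from http.cookies import SimpleCookie


def self_url(environ=os.environ):
    """Build a self-referencing URL for the running script (assumes http://)."""
    host = environ.get("SERVER_NAME", "localhost")
    port = environ.get("SERVER_PORT", "80")
    # Leave the port out when it is the HTTP default.
    netloc = host if port == "80" else "%s:%s" % (host, port)
    return "http://" + netloc + environ.get("SCRIPT_NAME", "")


def cookies(environ=os.environ):
    """Return the cookies from HTTP_COOKIE as a plain dict of name -> value."""
    jar = SimpleCookie()
    jar.load(environ.get("HTTP_COOKIE", ""))
    return {name: morsel.value for name, morsel in jar.items()}


if __name__ == "__main__":
    # Values taken from the Example column of the table above.
    sample = {
        "SERVER_NAME": "www.host.com",
        "SERVER_PORT": "80",
        "SCRIPT_NAME": "/cgi-bin2/script",
        "HTTP_COOKIE": "COOKIE1_NAME=COOKIE1_VALUE; COOKIE2_NAME=COOKIE2_VALUE",
    }
    print(self_url(sample))
    print(cookies(sample))
```

Both helpers fall back to harmless defaults when a variable is unset, which matters because servers differ in which of the optional variables they provide.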
Code Examples
In Perl
This will accept either POST or GET requests and will output a list of all the variables and the time on the server when the request was received.
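#!/usr/bin/perl
use strict;
use warnings;

# Escape characters that are special in HTML so submitted data cannot inject markup.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

my $method = $ENV{'REQUEST_METHOD'} || '';
my @pairs;
if ($method eq 'POST') {
    # Form data arrives on STDIN; CONTENT_LENGTH says how many bytes to read.
    my $buffer = '';
    read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'} || 0);
    @pairs = split(/&/, $buffer);
} elsif ($method eq 'GET') {
    @pairs = split(/&/, $ENV{'QUERY_STRING'} || '');
} else {
    print "Content-Type: text/html\r\n\r\n";
    print "<html><body>\n";
    print "<p>Error: Method " . escape_html($method) . " not allowed</p>\n";
    print "</body></html>\n";
    exit;
}

my %FORM;
foreach my $pair (@pairs) {
    next if $pair eq '';
    $pair =~ tr/+/ /;
    # Limit the split to 2 so that values containing '=' stay intact.
    my ($name, $value) = split(/=/, $pair, 2);
    $value = '' unless defined $value;
    # Decode %XX escapes in both the name and the value.
    $name  =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    # Repeated names are joined into a comma-separated list.
    if (!exists $FORM{$name}) { $FORM{$name} = $value; } else { $FORM{$name} .= ", $value"; }
}

print "Content-Type: text/html\r\n\r\n";
print "<html>\n<head>\n";
print "<title>Test CGI</title>\n";
print "</head>\n<body>\n<h1>Test CGI</h1>\n";
my $time = localtime;
print "<p>Form submitted at $time</p>\n\n";
foreach my $key (keys %FORM) {
    print escape_html($key) . " = " . escape_html($FORM{$key}) . "<br>\n";
}
print "</body>\n";
print "</html>\n";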
Download the source
In PHP
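This will do pretty much the same as the above, but more neatly, and it differentiates between POSTed data and GET data. It would be easy to expand it to display cookies as well (using the $_COOKIE array). Older PHP 4 releases exposed the same data as $HTTP_GET_VARS, $HTTP_POST_VARS and $HTTP_COOKIE_VARS, which current versions of PHP no longer provide.
<html>
<body>
<p>Submitted : <?php echo date("H:i:s D dS M Y T") ?></p>
<h2>Get variables</h2>
<?php
// htmlspecialchars() stops submitted values from injecting markup.
// Array values (from field names ending in []) are not handled here.
foreach ($_GET as $var => $val) echo htmlspecialchars("$var = $val") . "<br>\n";
?>
<h2>Post variables</h2>
<?php
foreach ($_POST as $var => $val) echo htmlspecialchars("$var = $val") . "<br>\n";
?>
</body>
</html>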
Download the source
In BASIC
I won't include an inline copy of the source on the page, but this is a simple CGI application that I wrote for our local intranet. It is a CD player interface which allows me to control the CD on my main Risc PC from any of the other machines on the office LAN. I include both a compressed and an uncompressed version of the source. It uses the listings used by Leo White's excellent CDPlay application, if available. If not, it will call the CD "Unknown CD" and simply number the tracks.
I can't find a copy of the specs for controlling the volume, otherwise I would include that as well.
Instructions
If you are using WebJames, drop the compressed version into a directory within the website which supports BASIC scripts (e.g.
in the default setup) and run !WebJames. Then point your web browser at the location of the script (e.g. http://localhost/cgi-bin2/cd). The interface should be fairly intuitive.
Setting the system variable
to no will disable webCD.
Setting the system variable
changes the refresh rate of the clients (e.g. setting a value of 20 will cause clients to refresh every 20 seconds; webCD defaults to 10 if the variable is unset).
Download the compressed BASIC (Tokenised)
Download the uncompressed source (Untokenised)
Related links
General
- CGI Programming FAQ
- Idiots Guide to Solving Perl CGI Problems
- CPAN (for Perl modules)
- cgic: an ANSI C library for CGI Programming
- The Common Gateway Interface Specification (CGI/1.1)
- Common Gateway Interface - RFC Project Page (Working proposals)
- PHP (PHP: Hypertext Preprocessor)
- Apache (Free Linux/MacOS X/Windows/*nix clones webserver)
RISC OS Specific
- WebJames (Free RISC OS webserver with CGI support)
- PHP for RISC OS
- HTML3 (Macro inserter with Perl and PHP support)