Simple CGI programming
Or you are into web development and want to know why Python frameworks now use WSGI instead of CGI. Some things become clear just by understanding how the web worked earlier.
Or you are just a web geek and want to spend an hour on old but interesting technology.
CGI (Common Gateway Interface) is an interface between a web server and an executable program. The web server uses it to pass the user's input to the program and, once the program has run, to return the program's output to the user.
The user's input, or request data, reaches the program through environment variables and STDIN. The program uses it to generate a response, which it prints to STDOUT. The web server sends whatever was printed to STDOUT back to the user as the response. It's that simple.
CGI is simple, and that simplicity is both a strength and a weakness. Its weaknesses are well known; the main ones are:
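To make the flow above concrete, here is a minimal sketch of a CGI program that reports what the web server handed to it. `REQUEST_METHOD`, `QUERY_STRING` and `CONTENT_LENGTH` are standard CGI variables; the function name `describe_request` is made up for this illustration, and the script echoes input back as plain text only to show where it comes from.

```shell
#!/bin/bash
# Sketch: show where a CGI program finds the request data.
# describe_request is a hypothetical helper name, not part of CGI itself.

describe_request() {
    local method="${REQUEST_METHOD:-GET}"
    local query="${QUERY_STRING:-}"
    local length="${CONTENT_LENGTH:-0}"
    local body=""

    # A request body (for example a POSTed form) arrives on STDIN.
    # CONTENT_LENGTH tells us how many bytes to read.
    if [ "$length" -gt 0 ] 2>/dev/null; then
        body=$(head -c "$length")
    fi

    printf 'method=%s\n' "$method"
    printf 'query=%s\n' "$query"
    printf 'body=%s\n' "$body"
}

# The response goes to STDOUT: headers, a blank line, then the body.
printf 'Content-type: text/plain\n\n'
describe_request
```

You can try it without a web server by setting the variables yourself, for example `printf 'a=1' | REQUEST_METHOD=POST CONTENT_LENGTH=3 ./script.cgi`, where `script.cgi` is whatever file name you saved it under.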
- Security: it has none of the built-in protections that modern web frameworks usually provide, so it's easy to make a mistake.
- Performance: it creates a new process for every request, so the load on the machine grows with the number of requests.
Both are critical factors to consider before you start. With them in mind, let's write a simple CGI script that returns the server time in response to a GET request.
We need a web server that supports CGI. There are many options, but we will use a small but robust server called thttpd. It's FOSS. Its website describes it as:
"thttpd is a simple, small, portable, fast, and secure HTTP server." (thttpd website)
thttpd supports CGI. Let's install it.
    # Download
    wget https://acme.com/software/thttpd/thttpd-2.29.tar.gz

    # Extract
    tar -zxvf thttpd-2.29.tar.gz

    # Configure
    cd thttpd-2.29
    ./configure
    make

    # make should create an executable called thttpd
    # You can move it to any folder on system path or run from this folder

    # Let's serve a static page
    mkdir web
    echo "Hello World!" > web/index.html
    ./thttpd -D -p 8080 -h 0.0.0.0 -d web

    # now go to http://localhost:8080/ in your browser
We are going to use the following configuration options to start thttpd:
- cgipat = file name pattern for CGI executables, so only files that match the pattern get executed. You can even limit it to a single file if you want.
- dir = directory thttpd changes into at startup and serves files from (it is also the chroot directory when chroot is enabled)
- logfile = path of the log file
- pidfile = path of the pid file that stores the thttpd process id
- port = port to listen on
- host = host to bind to
We can store them in a .conf file and use it to start the server. Before we start the server, let's create a bash script that will respond to the time request. A CGI script has to follow a few rules:
- Request inputs arrive as environment variables and on STDIN
- Whatever is written to STDOUT is sent as the response
- Response headers are also written to STDOUT, and they come first
- The header block has to end with a blank line
- The Content-type header is a must
Let's call this CGI script "current_time.cgi" and store it inside a scripts folder under our base web folder.
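The rules above can be sketched as one small helper. The name `send_response` and its two parameters are assumptions made for this example; the only thing CGI itself requires is the order of the output: headers, a blank line, then the body.

```shell
#!/bin/bash
# Hypothetical helper: write a complete CGI response to STDOUT.
#   $1 = value for the Content-type header
#   $2 = response body
send_response() {
    local content_type="$1"
    local body="$2"

    printf 'Content-type: %s\n' "$content_type"   # headers come first
    printf '\n'                                   # a blank line ends the header block
    printf '%s\n' "$body"                         # everything after it is the body
}

send_response "text/plain" "It works"
```

Keeping the header and blank line in one place makes it harder to forget either of them, which is the most common reason a first CGI script returns an error instead of a page.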
    .
    ├── example.conf
    ├── thttpd.log
    ├── thttpd.pid
    └── web
        ├── index.html
        └── scripts
            ├── current_time.cgi
            └── index.html

    2 directories, 6 files
Our example.conf configuration will look like this
    dir=/home/thej/code/experiments/web
    logfile=/home/thej/code/experiments/thttpd.log
    pidfile=/home/thej/code/experiments/thttpd.pid
    port=8080
    host=0.0.0.0
    charset=utf-8
    cgipat=**scripts/current_time.cgi

Our script "current_time.cgi" is simple: it just prints the date and time followed by the text "Hello World". It needs to be executable.
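    #!/bin/bash
    # Format the current UTC time once and keep it in a variable
    now=$(date -u '+%a, %d %b %Y %H:%M:%S %Z')

    # Headers first, then the blank line that ends the header block
    echo "Content-type: text/html"
    echo ""

    # Response body
    echo "$now"
    echo "<br>"
    echo "Hello World"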
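Before starting the server, it can help to run the script by hand and check that its output has the shape a web server expects. The following is only a sketch: `check_cgi_output` is a hypothetical function name, and it checks just two things, that a Content-type header is present and that the header block ends with a blank line.

```shell
#!/bin/bash
# Hypothetical checker: reads CGI output on STDIN and succeeds only if a
# Content-type header appears before the first blank line.
check_cgi_output() {
    local line seen_type=no cr
    cr=$(printf '\r')

    while IFS= read -r line; do
        line=${line%"$cr"}                 # tolerate CRLF line endings
        if [ -z "$line" ]; then
            # Blank line: the header block is over.
            [ "$seen_type" = yes ] && return 0
            return 1
        fi
        case "$line" in
            [Cc]ontent-[Tt]ype:*) seen_type=yes ;;
        esac
    done

    return 1    # no blank line at all, so the header block never ended
}

# Example usage (paths assume the folder layout shown earlier):
#   chmod +x web/scripts/current_time.cgi
#   ./web/scripts/current_time.cgi | check_cgi_output && echo "output looks OK"
```

The header name match only covers the common capitalizations, which is enough for a quick manual check.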
To run it, start the server from your shell and then go to http://localhost:8080/scripts/current_time.cgi

    thttpd -D -C example.conf
There you have it. Your first CGI program.
If you still want to continue, try parsing user inputs from the environment variables or STDIN. See how difficult it is, given how unfriendly the internet is today, and you will suddenly start appreciating the frameworks we have today. That said, I still love the simplicity of CGI.
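As a starting point for that exercise, here is a rough sketch of reading one parameter from `QUERY_STRING` in bash. The helper names `urldecode`, `html_escape` and `get_param`, and the parameter called `name`, are all made up for this example. It is not hardened: it ignores repeated keys, does not decode parameter names, and does not validate malformed escapes, which is exactly the kind of work a framework does for you.

```shell
#!/bin/bash
# Sketch only: hypothetical helpers for reading one query parameter.

# Decode '+' and %XX escapes in a URL-encoded value.
urldecode() {
    local s="${1//+/ }"      # '+' encodes a space in form data
    s=${s//\\/\\\\}          # protect literal backslashes from printf %b
    s=${s//%/\\x}            # turn %XX into \xXX
    printf '%b' "$s"
}

# Escape characters that are special in HTML, so user input is shown as
# text instead of being interpreted as markup.
html_escape() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' \
                           -e 's/>/\&gt;/g' -e 's/"/\&quot;/g'
}

# Print the decoded value of the first parameter called $1.
# Returns non-zero if the parameter is not present.
get_param() {
    local wanted="$1" pair key value
    local -a pairs=()

    # Split on '&' into an array; no glob expansion happens here.
    IFS='&' read -r -a pairs <<< "${QUERY_STRING:-}"

    for pair in "${pairs[@]}"; do
        key="${pair%%=*}"
        value=""
        case "$pair" in
            *=*) value="${pair#*=}" ;;
        esac
        if [ "$key" = "$wanted" ]; then
            urldecode "$value"
            return 0
        fi
    done
    return 1
}

# Greet the visitor, falling back to "World" when no name was given.
name=$(get_param name) || name="World"
printf 'Content-type: text/html\n\n'
printf '<p>Hello, %s!</p>\n' "$(html_escape "$name")"
```

Saved as a CGI script that matches your cgipat, a request ending in `?name=Ada` should greet Ada. The escaping step matters: without it, anything a visitor puts in the URL ends up in the page as HTML.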