modified: Sunday 27 June 2021
Why I recommend CGI instead of web frameworks
Originally written on lobste.rs back in 2018
A comment about CGI in general: it's absolutely beautiful.
When I wrote my own site backend a few years back I had no knowledge about the world of interfacing webservers with code. I discovered that there were many, many methods and protocols that each webserver only seemed to support a subset of. And many people telling me that CGI was old and bad and that I shouldn't use it.
I had a nightmare getting non-CGI things to work. I had no experience and background here, so not much of it made sense. I had presumed 'FCGI' was a "fixed" version of CGI, but didn't succeed at getting it to work after following a few guides and trying a few different webservers. I gave up.
I decided I should do the opposite of modern advice. I tried CGI. And I was immediately hooked by its simplicity.
- No dependencies or libraries
- Works with any programming language that can print and read text
- Supported by "most" webservers out there (*cough* everything but nginx)
For those not in the know, a fully working CGI script is as simple as this:
    #!/bin/sh
    printf 'content-type: text/html\n\n'
    printf '<strong>Greetings Traveller</strong>'
    printf "<p>The date and time are $(date)</p>"
The webserver itself handles all the difficult bits of the HTTP headers. You just need to provide a content-type and then the page itself. Done.
If you want to provide more (cookies, etc) you can; it's just one more 'printf' line and you're done. No libraries, no functions, no complexity. You don't even have to parse strange constructs. Just print.
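As a rough sketch of the "one more printf line" idea, the script below sets a cookie and reports whatever cookies the browser sent back. The cookie name 'visited' and its attributes are made up for illustration. HTTP_COOKIE is the environment variable in which CGI webservers pass along the request's Cookie header.

```sh
#!/bin/sh
# Sketch only: the cookie name "visited" and its attributes are examples.

# Emit the response headers; each header is just one printf line.
print_headers() {
    printf 'Content-Type: text/html\n'
    printf 'Set-Cookie: visited=yes; Path=/; Max-Age=3600\n'
    printf '\n'   # a blank line ends the headers
}

# Show the cookies sent by the browser, or "none" if there were none.
# A real page should HTML-escape this value before printing it.
print_body() {
    printf '<p>Cookies received: %s</p>\n' "${HTTP_COOKIE:-none}"
}

print_headers
print_body
```

On the first visit the page should report no cookies. On the next visit the browser sends the cookie back and it appears in HTTP_COOKIE.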
If you want to look at URL strings (eg for GET) you just need to be able to access the environment variable QUERY_STRING. If you want to access body data (eg for POST) you just need to read input. Just as if someone was sitting there typing into stdin of your program.
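Here is a minimal sketch of both cases. The helper names 'query_param' and 'read_body' are my own inventions, not part of CGI, and the parameter name 'name' is just an example. It assumes the webserver sets REQUEST_METHOD and CONTENT_LENGTH, as CGI webservers normally do, and it does no URL-decoding, so '%20' and '+' come through exactly as sent.

```sh
#!/bin/sh
# Sketch only: query_param and read_body are hypothetical helper names.

# Print the value of one key from a query string like "name=Sam&lang=sh".
# $1 = key to look for, $2 = the query string. No URL-decoding is done.
query_param() {
    printf '%s\n' "$2" | tr '&' '\n' | while IFS='=' read -r k v; do
        if [ "$k" = "$1" ]; then
            printf '%s\n' "$v"
            break
        fi
    done
}

# Print the POST body. CONTENT_LENGTH says how many bytes are waiting on
# stdin; for anything other than a POST, nothing is read.
read_body() {
    if [ "${REQUEST_METHOD:-GET}" = POST ] && [ -n "${CONTENT_LENGTH:-}" ]; then
        head -c "$CONTENT_LENGTH"
    fi
}

name=$(query_param name "${QUERY_STRING:-}")
body=$(read_body)

# text/plain is used so the echoed input needs no HTML escaping.
printf 'Content-Type: text/plain\n\n'
printf 'name parameter: %s\n' "${name:-(none)}"
printf 'POST body: %s\n' "${body:-(none)}"
```

Because it is all environment variables and stdin, you can try a script like this without a webserver at all, for example by running it from a terminal with QUERY_STRING set by hand.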
It does get ugly for complex or multipart POSTs. That's where a library or program can help. But you only need to attack that once you get there.
Compare this to every other method of talking to a webserver out there:
- No dependencies or libraries.
- Works with any programming language ever that can read text and print text.
- Zero external config other than telling your webserver to enable CGI on your file
A related story of teaching
A few months back I was helping some students with their website project. They were new to web development and had been recommended to use flask, a python library that acts as a webserver and webserver interface all in one. They were having extreme difficulty wrapping their heads around many concepts. Notably:
- Serving of static files (like stylesheets)
- Cookie handling
- Mapping of URLs to files, functions and directories.
Many of their problems stemmed from them not knowing how HTTP worked in the first place, so I was teaching them this. What made this process horrible was then also having to work out, and then explain, how flask abstracts these concepts into its own processes and functions. I could understand why it all seemed completely opaque to beginners like them.
They thought pages were unreadable objects generated by the templating code, and that the templates themselves were sent to the user's browser along with the page. They thought cookies were handled and stored by the webserver as well as the client. The way flask's functions worked and the examples they followed suggested this to them.
If I'm ever again in the situation of helping new people learn web technology then I'm going to start them on, or convert them to, CGI right off the bat. It's easier to teach, easier to understand, easier to get working on most webservers, and it isn't locked in to any particular language or framework.
The only downside of CGI that I know about is that it starts a new process to handle each user request. Yes, that's a problem for big sites handling hundreds or thousands of visitors per second. But by the time a new student gets to running a big site they will have already encountered many, many other scalability issues in their code and backend/storage. Let alone teaching them database and security concepts. There's a reason we have quotes like "premature optimisation is the root of all evil".
I don't think students new to webdev should be started on anything other than CGI. They can use any language they want. They can actually understand what they're doing. And they're not hitting any artificial barriers or limits set by frameworks or libraries.
The whole idea that "CGI should be dead" makes little sense from my context and point of view. I run my own site, help maintain a few others and try to assist others in learning and coping with webdev.
I think the "CGI should be dead" view makes sense only in the context of very high workload sites. Whilst these handle a large percentage of the web's total traffic, the percentage of people actually running these sites is small. These are different units: traffic of visitors vs people running sites. I think we confuse them.
It's too easy to get caught up in "professional syndrome", where you look up to the big players and trust in their opinions. But you also need to understand that their opinions are based on their current experiences, which are often a world away from what the rest of us should be worrying about.
If a captain of a battleship says that cannons are his biggest problem then you shouldn't try to learn about and use cannons to build your first ship. You should instead realise that only a tiny fraction of ships, even the really big ones, need them.
Related: Time safety is more important than memory safety