published: Thursday 5 September 2019
modified: Tuesday 27 December 2022
author: Hales
markup: html
Minisleep
Minisleep - a tiny wiki engine done right.
Minisleep is genuinely small (<1000 LOC), has an (optional) graphical page editor with drag-and-drop image support, statically compiles pages to HTML files, needs almost zero dependencies, is simple to move from server to server and is secure against external attack.
Minisleep is designed to avoid many of the pitfalls encountered by the author when setting up and using other wiki engines:
- Lists of perl/php/python dependencies. A wiki should not need hundreds of MiB of code to display a few pages. A few KiB is enough.
- Complex default page templates. Templates should be easy to change: simple HTML and CSS with no knowledge of the dark arts required.
- High time & effort demands for maintenance, especially for fixing breakages after updates. A good wiki should help you save time, not take your time away. There is a reason you are not using Sharepoint.
- Complex security models. For better or for worse: many people do not know how to (or have the time to) implement and maintain full and proper security on their wikis, so many end up being spammed. The simpler the security model is the easier it is to prevent this.
Intro
Minisleep is written in POSIX shell. This is the lowest common denominator across most webhosts, so it should "just work" out of the box for most users.
If you want to try Minisleep then see the 5 minute quickstart section. This software is designed to be run from wherever it is extracted, no install process is necessary.
Minisleep can be used for almost anything:
- Project documentation
- Community wikis
- Corporate front sites (intended to only be edited by select staff)
- Personal websites & blogs
However, Minisleep does not have:
- A user signup form to create accounts. Accounts must instead be made by the administrator.
- Page namespaces. Folders may or may not do what you want instead.
- Conflict-resolution (for when two people edit a page at the same time). Last user wins.
- Table creation support in the WYSIWYG editor. In the works!
- Any form of warranty (see the GPL). This is free software from the internet, use at your own risk.
Downloads
2020-08-30: Version 1.20 (latest)
- Admin page: now provided by the cgi script instead of /misc/admin (as part of the new CSRF protections). If upgrading: you will need to update your links in buildpage.sh and you will probably want to delete your old admin page/folder.
- CSRF protection: added. Auto-generates a per-site key when first used, then works on hashes of this and the user's username.
- Config: added info on how to set URL_PUBLIC to make Minisleep your root URL site
- Config: updated HTTPAUTH_MANDATORY info
- Config: added default umask
- Added <head> around meta refreshes to improve compatibility with Chromium/Chrome
- Example server configs: added PDF mimetype to lighttpd conf
- Typo: replaced "foobar" in license texts
- action_edit: Fixed an unintentional leak of text to stdin (which broke Yaws compat)
- Editor CSS: minor changes
- Favicon now symlinked
- HTTP auth: updated to support the Litespeed HTTP server, commonly used by shared hosts.
- Page revisions: now a togglable feature, in case you don't want to store old page versions.
- Fixed incorrect use of % instead of %% when processing key=value pairs in GET strings
- Changed David the magician's robes to suit the default site colour scheme better.
- Some misc small fixes & changes.
- Changed the codebase from bash to POSIX sh. No longer requires bash.
- Renamed config (to config.ini) and removed bash-specific features.
- Removed the (unmentioned) hard dependency on flock. Now it's used only if available.
- Fixed the dependencies list. It was missing some coreutils, this may be important for people on non glibc-linux platforms.
- Heavily reworked documentation.
- Fixed WYSIWYG editor form losing focus when editor buttons are clicked.
- Added HTTPAUTH_MANDATORY option to the config, to provide a layer of protection if/when webserver configurations are accidentally changed. Otherwise such accidental changes could allow anyone to edit your Minisleep website.
- Moved the bad-cgi-implementation CONTENT_LENGTH workaround into the main minisleep script (it used to be in minisleep_lessgreedy.cgi).
- Lots of other small fixes & changes.
- Fixed several typos in the documentation
- Worked around variables unintentionally being substituted in the documentation pages.
- Initial release
Dependencies
- Linux (likely works on other *nixes, ask for help if you have problems)
- A POSIX-y shell (tested: dash, bash).
- GNU coreutils or equivalents (head, tr, cat, cut, sed, realpath, touch).
- A HTTP webserver that supports CGI (tested: lighttpd, apache, hiawatha, Yaws, litespeed).
Optionally: flock (from util-linux) is used if it is available. This prevents race conditions that can cause page corruption if multiple people submit page edits at the same time. This is unlikely to be an issue for low-editor count sites; and if you use the web editor then page revisions are automatically backed up to a ds_revisions folder regardless.
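The "use it if it is available" behaviour can be sketched in a few lines of POSIX shell. This is an illustration rather than Minisleep's actual code: the helper name with_lock and the lockfile path are made up for the example.

```sh
#!/bin/sh
# Sketch only: 'with_lock' and the lockfile path are hypothetical names.

# Run a command under an exclusive lock when flock(1) is installed,
# and run it unprotected when it is not.
with_lock() {
    lockfile=$1
    shift
    if command -v flock >/dev/null 2>&1
    then
        flock "$lockfile" "$@"
    else
        "$@"
    fi
}

# Example: a second caller using the same lockfile would wait its turn.
workdir=$(mktemp -d)
with_lock "$workdir/example.lock" echo "page saved"
rm -rf "$workdir"
```

Without flock the command still runs, just without protection against two simultaneous edits.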
GPLv3 or later, copyright William Hales 2019. Contact minisleep AT halestrom DOT net.
Features in detail
Several options to write pages
- Online interface: HTML WYSIWYG with drag-and-drop image support (no separate uploading necessary). Recommended for new users.
- Supports any markup to HTML converter of your choice: markdown, textile, mediawiki, bbcode, reStructuredText, uuencoded RTF, etc. Adding your own requires adding one line of code to buildpage.sh
- Pages can be edited locally (with your favourite editor) and synced to your site using tools like git, rsync or unison. No additional complexity or config required.
- Optional feature (disabled by default): a page can be an executable script, where any text printed becomes the page. Useful for generating site indexes, such as front pages for blogs.
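As a sketch of the script-as-page idea, the hypothetical page script below prints an HTML list linking to every subfolder beside it, which is roughly what a blog front page needs. The function name and the one-post-per-subfolder layout are assumptions for the example, as is skipping folders that start with ds_ (such as the ds_revisions folder mentioned above).

```sh
#!/bin/sh
# Sketch only: 'list_posts' is a hypothetical name and the folder layout
# (one post per subfolder) is an assumption.

# Print an HTML list with one link per subfolder of the given directory.
list_posts() {
    echo '<ul>'
    for dir in "$1"/*/
    do
        [ -d "$dir" ] || continue          # no subfolders: the glob stayed unexpanded
        name=$(basename "$dir")
        case "$name" in ds_*) continue ;; esac   # skip internal folders
        # Names are printed as-is; encode them if they may contain HTML characters.
        printf '<li><a href="%s/">%s</a></li>\n' "$name" "$name"
    done
    echo '</ul>'
}

list_posts .
```

Whatever the script prints is what ends up as the page content.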
Lightweight: Almost zero dependencies
- Common commandline utilities and a HTTP webserver that supports CGI (eg apache, lighttpd, hiawatha, yaws)
- Ridiculously easy to setup and to move from host to host.
- No daemon, no database.
- Less than 1000 lines of code: designed to be understandable and fixable by non-experts.
Statically compiles pages: fast & resilient to failure
- All scripts can be disabled or fail and the site will still stay online. You can repair problems in your own time.
- Likely faster page delivery than any traditional dynamic wiki software (Mediawiki, Doku, etc).
Dumb and simple security
- HTTP authentication, handled by your webserver instead of this wiki's scripts.
- Documentation & examples are provided for several popular webservers.
- No notable attack surface if you keep your login credentials safe.
- Anti-CSRF tokens.
No database: pages stored as files and folders
- Dead simple to administrate and backup.
- Very easy to migrate page content to and from other wikis and systems, so you don't feel trapped.
Easy to theme
- Comes with one very short CSS file, rather than one filled with dark magic.
- One short script acts as the page template, intended to be edited & kept by users across updates.
Security model
Or "how can something using shell be secure against bugs?".
Minisleep is divided into two parts:
- Normal pages: what most users see. Static .html files in folders making up the public side of the wiki.
- Editor script: one special URL used to edit pages, secured with HTTP authentication.
"HTTP authentication" is an HTTP feature that makes your web browser pop up a username and password dialog.
These HTTP auth challenges are sent and handled by your HTTP webserver, not Minisleep itself.
This means Minisleep has two notable attack routes:
(1) Unauthorised users/attackers
- Have full access to the public side of the website (static .html files).
- Are challenged to provide a username+password (via HTTP auth) if they try to access the editor script.
- The HTTP server will not forward any attacker requests to Minisleep's scripts until they succeed at this auth.
- Can perform CSRF against legitimate users (see next section)
(2) Authorised users/attackers (ie those with valid usernames + passwords)
- Can edit pages and insert malicious javascript/images/trackers/ads/etc. This is a problem faced by most wiki engines.
- Can exploit bugs in the editor script to run arbitrary code on the server.
Several defences have been put in place to mitigate this last problem, but Minisleep is not guaranteed to be 100% bug free. Your primary line of defence is to only give accounts to trusted people.
CSRF security
"Cross site request forgery" is where one website (controlled by an attacker) manipulates a user into sending a request to another site. This can allow attackers to perform any action on a site that a user could do (such as editing pages).
CSRF attack methods range from the simple (hyperlinks, forms) to the complex (javascript iframes and websockets). Sometimes clicking on a link in an email can be enough for a CSRF attack to work.
As of Minisleep version 1.20: per-website and per-user tokens are used to try and prevent CSRF attacks. Only simple (information) pages can be accessed without a token. Normal users should not even have to know tokens are being used.
This method of prevention may not be perfect, so comments are welcome. Notably the same token is re-used forever for a single user on a single installation of Minisleep.
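One way to picture the token scheme is the sketch below. It is not Minisleep's implementation: the key file location, the use of sha256sum and the exact string that gets hashed are all assumptions chosen for the example.

```sh
#!/bin/sh
# Sketch only: function names, key file location and hash choice are assumptions.

# Create the per-site key on first use (32 random bytes, hex encoded).
site_key() {
    if [ ! -s "$KEYFILE" ]
    then
        ( umask 077; head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n' > "$KEYFILE" )
    fi
    cat "$KEYFILE"
}

# Token = SHA-256 over "site key:username", so it differs per site and per user.
csrf_token() {
    printf '%s:%s' "$(site_key)" "$1" | sha256sum | cut -d ' ' -f 1
}

# Succeeds only if the submitted token matches the one expected for this user.
csrf_check() {
    [ "$(csrf_token "$1")" = "$2" ]
}

# Example run with a throwaway key file.
KEYFILE=$(mktemp -d)/site_key
token=$(csrf_token david)
csrf_check david "$token" && echo "token accepted"
```

Because the key never changes, the token for a given user never changes either, which matches the limitation noted above.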
5-minute quickstart
Minisleep comes with working example server configs that you can run in-place.
Option 1: lighttpd
Lighttpd's configuration files tend to be simpler than those of Apache.
1. Install lighttpd on your computer. Eg:
sudo xbps-install lighttpd # Void
sudo apt-get install lighttpd # Devuan, Debian, Ubuntu, Mint, etc
sudo yum install lighttpd # Fedora
2. Enter the folder 'minisleep/docs/lighttpd'
3. Try to run lighttpd with the provided config:
$ lighttpd -f lighttpd.conf -D
You may need to provide the full path of lighttpd, depending on your distro:
$ /usr/sbin/lighttpd -f lighttpd.conf -D
4. Point your web browser to http://localhost:8080/minisleep/
If you want to edit any pages: the username is 'david' and the password is 'magic'.
If you plan to use Lighttpd yourself then pay attention to:
- Enabling server.follow-symlink
- Inserting mod_auth and mod_cgi in the right order to avoid module-loading problems.
- Configuring page expiry, so that browsers don't keep old copies of pages cached.
Option 2: Yaws
"Yet Another Webserver" also has a nice config file format.
1. Install yaws
2. Enter the folder 'minisleep/docs/yaws'
3. Run yaws with the provided config:
$ yaws --conf yaws.conf
4. Point your web browser to http://localhost:8080/minisleep/
If you want to edit any pages: the username is 'david' and the password is 'magic'.
If you intend to use Yaws yourself then note:
- Yaws aggressively caches pages by default. You may have to wait up to 30 seconds before refreshing will show changed page contents.
Option 3: Hiawatha
Hiawatha is another easy to configure webserver with some really nice features and a long history. It's not in the Debian repos but many other distros package it.
Unfortunately Hiawatha's future is unclear. As of 2019 the lead author has locked the forums and wants to scale down the project.
1. Install hiawatha
2. Enter the folder 'minisleep/docs/hiawatha/'
3. Get your current path using the 'pwd' command:
$ pwd
/home/valentine/library/code/minisleep/docs/hiawatha
4. Edit hiawatha.conf to reflect this path:
set START_POINT=/home/valentine/library/code/minisleep/docs/hiawatha
5. Run hiawatha with the provided config:
$ hiawatha -c . -d
6. Point your web browser to http://localhost:8080/minisleep/
If you want to edit any pages: the username is 'david' and the password is 'magic'.
If you intend to use Hiawatha yourself then pay attention to:
- MaxRequestSize (for uploading page edits with lots of big images)
- Enabling FollowSymlinks
Full installation procedure
(1) Obtain a HTTP webserver that supports CGI. If you are on a
shared host then one will probably have already been setup for you,
otherwise I recommend you install lighttpd or apache (two of the most popular options).
Further down this document is the
section "Tip: Testing CGI" that will make your life easier.
(2) Choose two URLs for minisleep to use. One URL for all of
the normal static pages to be under and one special URL for the editor's CGI
script. Valid choices include:
http://example.com/minisleep/
http://example.com/minisleep.cgi
http://example.com/
http://example.com/cgi-bin/editor.cgi
http://example.com/bobs_barbarians/
http://example.com/cgi-bin/bruce.cgi
...etc...
Note: Many webservers only allow you to enable HTTP auth for folders, not files. This means you may have to put the CGI file into its own special folder (eg cgi-bin/).
(3) Download
and extract your copy of minisleep somewhere safe. Do not extract it
into anywhere that your HTTP server will serve (as you would with many php
websites). Instead keep it somewhere such as your home directory where
other people cannot get access to it.
(4) Edit your minisleep 'config' to reflect your chosen URLs:
export URLPUBLIC='/bobs_barbarians'
export URLCGI='/cgi-bin/bobs_barbarians.cgi'
(5) Update minisleep's pages to reflect the changes made to this config file, otherwise the links on the pages will be broken:
$ source config
$ scripts/rebuild_all_pages.sh
(6) Add two symbolic links between your minisleep setup and your web server's WWW directory to reflect your chosen URLs. Examples include:
# On a shared host
ln -s ~/minisleep/public ~/public_html/bobs_barbarians
ln -s ~/minisleep/scripts/minisleep.cgi ~/public_html/cgi-bin/bobs_barbarians.cgi
# On my own server
ln -s ~/minisleep/public /var/www/html/bobs_barbarians
ln -s ~/minisleep/scripts/minisleep.cgi /var/www/html/cgi-bin/bobs_barbarians.cgi
(7) Configure your webserver to allow following symlinks. Some disable this by default.
At this point your install of minisleep should be working. Try it out in your browser.
(8) Enable HTTP auth for the editor URL (so that people need a username+password to edit pages).
For apache and many shared hosts: you can enable this feature using a .htaccess and a .htpasswd file. See the documentation of your webserver/host for more details.
Example (working) configurations for several webservers are included in the docs/ directory.
(9) Setup TLS (https) so that you can access and edit your website securely. If you do not do this then it is possible for attackers to sniff and steal login credentials whenever you use them, especially if you are on an untrusted network (eg open wifi).
Let's Encrypt is a popular free service for obtaining HTTPS certificates and many shared hosting providers automatically set you up with a free certificate anyway.
Managing HTTP authentication credentials (users)
The most common way of managing HTTP auth credentials is to use the 'htpasswd' utility. This tool "should" come with your HTTP server, but some distros only bundle a copy with apache. On Debian based distros it's separated into the apache2-utils package.
Use it like so:
$ htpasswd -c myauthfile.htpasswd bobuser # First time usage requires '-c' to create the file
$ htpasswd myauthfile.htpasswd another
$ htpasswd myauthfile.htpasswd thirduser
Htpasswd supports some better hash types than its default of apr1 (a variant of MD5), but make sure your webserver actually supports them before you try to use them. I have found many webservers simply ignore what they don't understand.
Hiawatha comes with its own version of htpasswd called wigwam.
If you have trouble getting a copy of htpasswd then a shell-script imitation is provided in the docs/ directory. It requires an openssl variant to be installed (generally true for any Linux server these days).
If all else fails: many HTTP servers also support 'plaintext' passwd files like this:
bob:bobs password in the clear
mary:turduckinator 3000
admin:password
If you are on a shared host then this may be unwise, as there's a higher chance of someone finding a way to read your files and find your passwords. Generally speaking: avoid using plaintext passwords.
Things you do not need to do
1. Tell minisleep where it's installed.
The symlinks are enough. Minisleep works out the rest.
I wish more website engines did this! Most require you to hardcode their locations (into multiple files too). That's just silly, the computer can do this work for you.
2. Setup an SQL database.
Minisleep keeps pages as files and folders. There is little to Minisleep that isn't hierarchical, so a relational database is not really beneficial.
Tip: Testing CGI
It's worth testing CGI with a simple script before trying to get Minisleep working.
Create a text file with the following contents:
#!/bin/sh
printf 'status: 200\n'
printf 'content-type: text/html\n'
printf '\n'
echo '<b> Moo said the cow </b>'
echo '<p> If you can read this then CGI is working. </p>'
Depending on your host & setup you will need to work out where to save it and under what name. Examples include:
~/public_html/cgi-bin/mytest.cgi # Common path on shared hosts
/var/www/html/mytest.cgi # Your own HTTP server.
If applicable: enable CGI for the relevant URL in your HTTP server's configuration. Examples for some webservers are in the docs/ directory, otherwise see your particular server's official documentation.
Make the script executable:
$ chmod a+x /var/www/html/mytest.cgi
Now browse to the relevant URL in your web browser. If everything is working then you will see "Moo said the cow" in bold, followed by the sentence confirming that CGI is working.
If instead you see the source code of your script, are prompted to download the script, or get an error, then CGI is not yet set up correctly.
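Before blaming the webserver it can help to run the test script by hand and check that its output has the shape CGI expects: headers first, then a blank line, then the body. The sketch below does this with a hypothetical helper called looks_like_cgi; it only checks for a content-type header and is not a full validator.

```sh
#!/bin/sh
# Sketch only: 'looks_like_cgi' is a hypothetical helper name.

# Succeeds if stdin looks like CGI output, ie a content-type header
# appears before the first blank line.
looks_like_cgi() {
    sed '/^$/q' | grep -qi '^content-type:'
}

# Write a copy of the test script to a temporary file and run it by hand.
cgi=$(mktemp)
cat > "$cgi" <<'EOF'
#!/bin/sh
printf 'status: 200\n'
printf 'content-type: text/html\n'
printf '\n'
echo '<b> Moo said the cow </b>'
EOF

if sh "$cgi" | looks_like_cgi
then
    echo "headers look fine"
else
    echo "no content-type header found"
fi
rm -f "$cgi"
```

If the script passes this check but the browser still shows its source code, the problem is in the webserver's CGI configuration rather than in the script.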
Recommendation: Use https
(If you are running minisleep on your home LAN or in another controlled network then you can safely ignore this section).
Every time you log in to a website you will want to make sure your connection is encrypted and secured. If it's not then people can steal your username+password and do all sorts of naughty things to your website. This isn't a problem unique to Minisleep, which is why the vast majority of websites now support HTTPS.
Every single HTTP server and environment has a different way of setting up TLS/SSL/HTTPS. You also need to create or get a valid certificate. At the time of writing Let's Encrypt is a very popular free service.
If you are on a shared host then they may be able to do this for you (and some do it automatically for free without asking).
Modifying buildpage.sh
Minisleep is split into two main scripts:
- scripts/buildpage.sh
- scripts/minisleep.cgi
The
buildpage.sh file is intended for users to edit and keep their edits
across updates of the wiki. The minisleep.cgi file on the other hand is
intended to remain unedited so that it can be easily replaced with
newer versions when updating.
buildpage.sh is not very long. You can skip all of the initial setup code in it and go right to the page rendering bits.
Adding support for your favourite markup language converter
Anything that can input a text file and output HTML will work.
In scripts/buildpage.sh:
# Any script/program/method that you want can be used to markup your pages into
# HTML. Minisleep comes with several examples included below, however they will
# probably need adjusting to meet your needs.
#
# Tips:
# - There are many different 'markdown' converters out there. If you use
# markdown then make sure to adjust the command line options below to match
# your variant.
# - 'pandoc' supports pretty much every format under the sun and is really
# convenient, but it's often in the form of a single >100MiB executable.
# - Stick to HTML if you're unsure, it requires no setup of external programs.
#
# 'script' is disabled by default, because it is suspected that many people do
# not like interfaces for executing arbitrary code on their server to exist.
case "$markup"
in
html) cp temp_pre temp_post ;;
html_bug) cp temp_pre temp_post ;;
markdown) markdown temp_pre > temp_post ;;
textile) pandoc -f textile -t html temp_pre > temp_post ;;
mediawiki) pandoc -f mediawiki -t html temp_pre > temp_post ;;
plaintext)
echo '<pre>' > temp_post
cat temp_pre | sed 's|<|\&lt\;|g ; s|>|\&gt\;|g ; s|'\''|\&apos\;|g ; s|"|\&quot\;|g' >> temp_post
echo '</pre>' >> temp_post
;;
#script)
# chmod u+x temp_pre
# . temp_pre > temp_post
# ;;
*) echo "Page generation error (buildpage.sh): unknown markup type '$markup'."
esac
Let's
say we want to add support for reStructuredText using a program called
'rst2html' from the Debian package 'docutils-common'. After installing this tool we can simply add the
following line to the mix:
restructuredtext) rst2html temp_pre > temp_post ;;
Done. Try it out by specifying 'restructuredtext' as the markup for a page when editing it.
Note: You may want to choose a more convenient-to-type name (like 'rst') instead.
Customise page appearance (aka templating) including top links
Minisleep does not use a separate template file, instead code and template are mixed together in buildpage.sh:
# ------------------------------------------------------------------------------
# -- Final Content Render
# ------------------------------------------------------------------------------
exec 1>index.html.temp
echo "
<!DOCTYPE html>
<html>
<head>
<meta http-equiv='Content-Type' content='text/html;charset=UTF-8' />
<title> $title </title>
<link rel='stylesheet' href='${URL_CSS}' type='text/css' />
<link rel='icon' href='${URL_FAVICON}' />
<meta name='expires' content='0' />
<meta http-equiv='pragma' content='no-cache' />
<meta name='viewport' content='width=device-width, initial-scale=1.0'/>
</head>
<body>
<header>
<div class='left'>
<a HREF='${URL_PUBLIC}/'>Home</a> |
<a HREF='http://www.autofish.net/'>Somewhere Else</a> |
<a HREF='https://libraryofbabel.info/'>Deeper</a>
</div>
<div class='right'>
<a HREF='ds_revisions/'>Revisions</a> |
<a HREF='${URL_CGI}?action=getcontrols&path=${PAGEPATH}'>Edit</a>
</div>
</header>
<main>
<h1> $title </h1>"
cat temp_post
echo "</main></body></html>"
# Shift the now completed page into production
# The 'mv' step is added for atomicity
mv index.html.temp index.html
rm temp_pre temp_post
rm ds_lockfile
That's it. Nothing more. Compare that to some default templates provided by other wikis :D
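The write-then-rename step at the end of the template is worth a closer look. A rename within one filesystem is atomic, so a visitor gets either the old page or the new one, never a half-written file. A minimal sketch of the same pattern, using a hypothetical helper called publish_page:

```sh
#!/bin/sh
# Sketch only: 'publish_page' is a hypothetical helper name.

# Read a finished page from stdin, write it under a temporary name,
# then rename it over the live file in a single step.
publish_page() {
    target=$1
    cat > "$target.temp" || return 1
    mv "$target.temp" "$target"
}

# Example: replace an existing page in a scratch directory.
workdir=$(mktemp -d)
echo '<p>old</p>' > "$workdir/index.html"
echo '<p>new</p>' | publish_page "$workdir/index.html"
cat "$workdir/index.html"
rm -rf "$workdir"
```

The temporary file sits next to the target so that both are on the same filesystem, which is what keeps the rename atomic.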
You will notice that single quotes ( ' ) are used instead of double quotes ( " ) in the HTML. This is 100% valid HTML and makes it easier to avoid quoting problems in the script, otherwise you have to write backslashes ( \" ) everywhere.
The default CSS file is equally short:
/* -----------------------------------------------------------------------------
* Default HTML constructs
* ---------------------------------------------------------------------------*/
body { margin: 0; font-family: Sans, Sans-Serif; }
h1, h2, h3, h4 { clear: left; }
pre
{
white-space: pre-wrap;
margin-left: 2rem;
}
img
{
height: auto;
margin: 0;
max-width: 100%;
padding: 0;
}
/* -----------------------------------------------------------------------------
* Main page components
* ---------------------------------------------------------------------------*/
main { margin: 1rem; }
header
{
background-color: #8A0000;
color: #AAAAAA;
padding: 0.3rem 0.5rem;
margin: 0;
overflow: hidden;
}
header a { color: white; }
header a:visited { color: white; }
/* -----------------------------------------------------------------------------
* Misc
* ---------------------------------------------------------------------------*/
.left { float: left; }
.right { float: right; }
#content h1:first-of-type {
clear: none;
margin-top: 0;
}
This CSS is intentionally short and bereft of magic. You can either work from it or wipe it and start from scratch; nothing will break. The editor interface provides its own CSS, so you don't have to worry about harming the editor.
Background and discussion
Why was Minisleep written?
Many years ago I wanted to start
my own personal website and I
thought a wiki backend would be perfect. I had spent a lot of time writing
for a game project's Mediawiki site and I had grown to love the
wiki-style markup & features.
My adventures setting up my own wiki didn't go well. After trying several I felt some usability problem themes emerging:
- Second class and limited markup support: a prime example is not being able to have multiple lines of text in a table cell.
- Complicated image upload procedures. How many steps does it take to get an image online? Per namespace or per page? Many failed the test of "is it actually easier to use an SFTP client like filezilla?".
- Extremely complex page templates. IF ELSE spaghetti.
(NB I've only solved some of this in Minisleep)
In particular I found the complexity of wiki projects was their biggest drawback. I was left feeling that "enterprise suitable" means "lots of space, time and effort required".
Some examples:
- Mediawiki is a behemoth, with lots to do and go wrong during the setup process. Great for big projects with spare hands, much more difficult to use for one person's personal site.
- Tiddlywiki has some cool concepts, but it uses lots of javascript and gets very slow for larger sites.
- ikiwiki looked absolutely perfect, but when I tried to set it up on my cheap shared host I had to fetch hundreds of megabytes of perl dependencies. It took me a few attempts to get it right and it broke for me when the host updated.
Generally speaking: I wanted something with similar features but less effort. Life is too short to be spent dusting and oiling software.
Several years back I wrote my own backend called Darksleep that I use for my own personal site. Originally I had to SSH in or remotely sync files to make changes, but
eventually I added an online interface for page creation and editing, slowly morphing it into something more like a wiki.
Minisleep is a rewritten version of Darksleep with many fixes and changes based on what I have learned from operating and running Darksleep. Notably the commenting and submission throttling features have been stripped, but I plan to add these back at some point as optional features.
Why is Minisleep written in shell instead of (eg) python, rust or go?
Shell
scripts are one of the lowest common denominators across all webhosts.
In particular: bash is popular enough that you're almost guaranteed to have it already; and if not then it's easy to get without needing special tools or an ore-train of dependencies.
Compiling executables to run on some hosts can be a pain. My current shared host bails any attempt at running (even static!) executables that I've compiled on other systems (such as Debian stable) and I'm not sure why. The hosting provider only permits access to their toolchain temporarily and only upon request.
My
experiences wrangling dependencies and runtime requirements for website backends drove me mad. A good tool does a lot with a
little, not the other way around.
Why CGI?
Context: CGI lets your scripts talk to a webserver, so site pages can be dynamic (made on the fly). There are infinite ways of doing this these days, with CGI now considered by many to be old or slow.
Here's a copy of an extensive answer to this question I wrote up on Lobste.rs a while back:
CGI is implemented inconsistently (and in silly ways): stdin is needlessly complicated
When a user accesses your CGI script they send an HTTP message using their browser. This message is split into two parts: header and body.
In CGI you:
- Access the header through environment variables (eg $QUERY_STRING)
- Access the body by reading standard input (stdin).
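To make the split concrete, here is a small sketch that picks one value out of $QUERY_STRING and reads exactly $CONTENT_LENGTH bytes of body from stdin. The helper names query_get and read_body are invented for the example, no URL-decoding is attempted, and the last few lines merely simulate what a webserver would provide.

```sh
#!/bin/sh
# Sketch only: 'query_get' and 'read_body' are hypothetical helper names.

# Print the value of one key from $QUERY_STRING (no URL-decoding is done).
query_get() {
    printf '%s\n' "${QUERY_STRING:-}" | tr '&' '\n' |
    while IFS='=' read -r key value
    do
        if [ "$key" = "$1" ]
        then
            printf '%s\n' "$value"
        fi
    done
}

# Read exactly $CONTENT_LENGTH bytes of the request body from stdin.
# 'head -c' is not strictly POSIX but is widely available.
read_body() {
    head -c "${CONTENT_LENGTH:-0}"
}

# Simulate what a webserver would hand to the script.
QUERY_STRING='action=getcontrols&path=/docs'
CONTENT_LENGTH=11
query_get path
printf 'hello worldTRAILING' | read_body
echo
```

Reading no more than $CONTENT_LENGTH bytes means the script never asks for more than the server has promised to send.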
Unfortunately different HTTP servers seem to disagree on exactly how to treat over-reading and under-reading of stdin. There are approximately three categories:
(1) Sensible & forgiving. Examples: apache, lighttpd
If you read too little stdin: no one minds
If you read too much stdin: no one minds, the read fails.
(2) Pushy. Examples: (some shared hosts?)
If you read too little stdin: the user is redirected to an error page.
This is annoying, but perhaps understandable. If all of stdin has not been read then perhaps your script crashed early. An easy workaround is to add something like "cat > /dev/null" to the end of your script.
(3) Confused. Examples: hiawatha, yaws
If you read too much stdin: your read calls hang forever.
This does not make sense to me. Why hang the read? The server knows there is nothing more to provide, making the script hang forever seems impolite. Worst of all the webserver then kills the script for taking too long. The only way out of this prison is to be a guard!
Minisleep's workaround
Minisleep spawns a second copy of itself, but with stdin sent over a controlled and well-behaved pipe.
if [ ! -z "${CONTENT_LENGTH:-}" ]
then
temp_conlen=$CONTENT_LENGTH
unset CONTENT_LENGTH
head --bytes "$temp_conlen" | scripts/minisleep.cgi
exit $?
fi
From a technical point of view: this solution is a bit inefficient. But hey, we're already using shell.
From a social standpoint: the problem itself is stupid. It turns CGI, an otherwise
great interface for learning, into something with dark traps.
Students and learners don't have a hope of finding out what is going wrong here
unless they already understand read calls and unix pipes.
Sed vs bash's in-built pattern substitution
Minisleep depends on 'sed' to HTML encode pages for the in-built editor (Sidenote: there are some curious and unexpected gotchas around this like UTF-7). I was curious to see if bash's pattern substitution (not a POSIX feature) could compete with sed, however in the test below it is nearly two orders of magnitude slower.
Example:
htmlencode()
{
sed 's|&|\&amp\;|g ; s|<|\&lt\;|g ; s|>|\&gt\;|g ; s|'\''|\&apos\;|g ; s|"|\&quot\;|g '
}
htmlencode2()
{
a="$(cat)"
a="${a//&/&amp;}"
a="${a//</&lt;}"
a="${a//>/&gt;}"
a="${a//\'/&apos;}"
a="${a//\"/&quot;}"
printf "%s" "$a"
}
$ time htmlencode < public/misc/docs/ds_raw > foo1
real 0m0.024s
user 0m0.014s
sys 0m0.009s
real 0m0.024s
user 0m0.015s
sys 0m0.009s
$ time htmlencode2 < public/misc/docs/ds_raw > foo2
real 0m1.755s
user 0m1.715s
sys 0m0.030s
real 0m1.759s
user 0m1.709s
sys 0m0.040s
$ diff foo*
103c103
<
---
>
\ No newline at end of file
Roughly 70 times slower, taking it from almost-imperceptible to annoying :(
In practice it can still be faster, but only if called many times over on small pieces of data. This isn't the use case here.
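For anyone wanting to check the sed-based encoder on a small input, here is a self-contained copy with a sample string. It is written slightly more plainly than the version above (the semicolons are left unescaped), and the sample input is made up. Note that the ampersand rule has to run first, otherwise the ampersands introduced by the later rules would be encoded a second time.

```sh
#!/bin/sh
# Sketch: the '&' rule comes first so that the entities produced by the
# later rules are not encoded again.
htmlencode() {
    sed 's|&|\&amp;|g ; s|<|\&lt;|g ; s|>|\&gt;|g ; s|'\''|\&apos;|g ; s|"|\&quot;|g'
}

printf '%s\n' '<a href="x">Tom & Jerry</a>' | htmlencode
# prints: &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```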
Why HTTP_AUTH?
It's damned simple compared to cookie-based auth methods and it prevents unauthorised requests from ever making it to the site scripts. The less surface area you have the less you have to worry about. See the 'Security model' section above.
Isn't CGI slow?
Context: many alternative communication interfaces have been made over the years that claim to be faster. CGI suffers the problem of needing to start a process for every user request.
- No. Webapp latency is a complex topic, server interfaces are only one part of the puzzle.
- CGI is only used in Minisleep when you edit pages, not when you view them.
Minisleep is the fastest site engine I have ever used. This is not because of choice of external technologies, it's because of design.
Sidenote: slowdowns in Minisleep tend to be dominated by uncached disk reads. Notably on
shared hosts the first edit will take a while to load but after that
things will be snappier. Traditional daemon-style website engines work around this by permanently hogging resources, there are ways of doing the same for Minisleep (if you are interested).
Bugs & Plans
WYSIWYG editor:
- Forms inside pages get destroyed when the editor is enabled.
- Dragging and dropping some types of files onto the editor (such as videos) embeds them in a broken manner (browser bug?).
- No table editing support (beyond copying+pasting them in and editing them from there). This used to exist in Firefox and was really nice, but it is now gone.
- No image resizing support. This used to exist in Firefox and was really nice, but it is now gone.
- Clicking on a formatting button removes focus from the typing area. (Update: still sometimes happens, eg after creating an empty H2 in Firefox)
- Needs key shortcuts. Ctrl+B for bold, etc.
- Needs more features (text colour, boxes, floats, tables, etc)
- 'Unformat' button only removes some types of formatting (not <pre>, <ul>, etc).
- Makes horribly ugly HTML code. Optional htmltidy (or similar) support?
- Editor is not at the same path as the pages themselves, so relative-pathed materials (images, video) do not show in the editor.
General:
- Documentation: info on cache control & page expiry.
- Feature: Page deletion (move deleted folders to a safe place).
- Feature: User management (add, delete, suspend, etc)
- Username detection fails if you use a HTTP_AUTH type other than basic (eg digest)
- Feature: Automatic table of contents generation. ie a simple script pass that looks at <h2>, <h3> etc tags and prepends some extra HTML.
- Feature: split inline images off into actual files (to make future page edits faster & easier)
- More elegant solution to the read(stdin) hang problem of some HTTP servers (see minisleep/scripts/minisleep_lessgreedy.cgi).
- Errant newlines are added to the end of a page every time it is edited (HTTP post quirk)
Debatable:
- HTTP keywords (POST, GET) are probably not used correctly (as intended by the standards that few people follow). PUT and some others might be worth considering.
Limitations (things that probably won't be changed or fixed):
- Gradated access control (different user roles)
- Namespaces - ie places where different users have different rights. (If really needed: run another copy of minisleep?)
Contact
Please send all of your comments, suggestions, workarounds, stories, bugreports, code and complaints to: minisleep AT halestrom DOT net. If you have read this far down the page then you should shoot me an email saying Hi.