published: Thursday 12 April 2018
modified: Sunday 22 April 2018
author: Hales
markup: textile

Meta: Blog comment systems

Ruben Schade is asking for advice about adding a comment system to his blog.

You fool! I'm only too willing to give it.

Disqus

The beast. Disqus is comments as a service. Sign up, add some JavaScript to your pages and you have it.

Here's an example from Disqus' own blog:

Pros:

Cons:

None of this ever agreed with me. Even ignoring the issues I see as a person running my own site, I've wanted to write comments on other people's blogs before and been actively stopped by my issues with Disqus. Not all people are the same, but for low read-volume blogs like mine I care about every commenter.

Stories of self hosting

My site is statically generated and has comment support. Many people will look at the comment posting interface and scream internally -- it looks easy to spam and people can impersonate each other. There's only a very, very weak captcha and no login/auth system. It sounds strange and vulnerable, but there's a good story here.

When I originally made the site I wrote a whole user registration system, complete with a million sanity checks and email verification links. To take a corporate view on things it was a HUGE success -- I didn't have one ounce of spam!

It turns out people didn't like having to register just to comment on a blog. They gave up halfway through. My logs said so.

I tested my system regularly to make sure registration worked (and it did). Only one comment ever made it, and it was from a friend I forced through the registration process. Even then he wasn't happy he couldn't set his name to start and end with 'X_X'.

Stepping back

My system was a gulag. I had gone full-bore in my own direction without a care for the users.

Of course they didn't want to sign up. It breaks their train of thought (they just want to write a comment!) and puts them into an interrogation chair. What is your mother's maiden address? Where did you hide the [body]?

What could I do?

Well, when in doubt, steal other people's ideas.

Introducing Irrlicht3d.org by Nikolaus Gebhardt. It has a commenting system that looks like this from the outside:

There's a minor bit of anti-bot verification (a single hard-coded word). Two of the four fields at the top are optional. And then you just write your comment, that's it.

Wouldn't this be vulnerable to mass spam attack? I decided the best thing to do was try it myself here as an experiment. If it failed then I could always pull the plug.

I'm still running that experiment today:

Status:

Aren't you afraid of abuse?

It's not hard for someone to write a script, hardcode my captcha in and try to spam or attack my site. My current system only prevents 'dumb' bots that randomly fill fields on every site.
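As a rough illustration of how little such a check involves, here is a minimal sketch in shell. It assumes the form values have already been decoded, and it pairs the hard-coded word with a second field that humans are told to leave blank. The word and the function name are placeholders, not my actual backend code.

```shell
# Placeholder word: pick your own and print it next to the form field.
ANTISPAM_WORD='irrlicht'

# passes_bot_check WORD HONEYPOT
# Returns 0 (accept) only if the visitor typed the expected word
# and left the "keep this blank" field empty.
passes_bot_check() {
    [ "$1" = "$ANTISPAM_WORD" ] && [ -z "$2" ]
}
```

A bot that stuffs every field with text fails the second test, and one that ignores the word field fails the first. A script written specifically for one site passes both, which is exactly the weakness described above.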

I've had to take down other sites (such as wiki backends) before because of spam attacks. It's always fun to see CPU usage on hosts pegged at 100% because a new wonder drug is being blogged+about+on+the+frontpage every 30 seconds. It's even better to look back at the site's history and discover this has been happening for weeks or months. Poor server. From what I've seen there are many "forgotten" sites on the web either being spammed or completely exploited/infected.

Experiencing this has made me calmer about the whole situation. I know what it looks like and why people do it.

Below are three strategies. At the moment I'm only using the first one; the others are future routes I'll take if I need to.

1. Throttling

Only a certain number of comments can be posted to my site per day and per week. I track this both by IP and through a global counter, so even a distributed spam attack can only post X comments before any more are blocked.
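A minimal sketch of what per-IP plus global throttling could look like using nothing but files. The directory, the limits and the function names are all assumptions for illustration. Only the daily limit is shown; a weekly one would work the same way with a different date format.

```shell
THROTTLE_DIR=${THROTTLE_DIR:-/tmp/comment_throttle}   # hypothetical location
MAX_PER_IP_DAY=5                                      # hypothetical limits
MAX_GLOBAL_DAY=50

# count_lines FILE: number of lines in FILE, or 0 if it does not exist
count_lines() {
    if [ -f "$1" ]; then wc -l < "$1" | tr -d ' '; else echo 0; fi
}

# throttle_allow IP: returns 0 and records the comment if both the
# per-IP and the global counter for today are still under their limits
throttle_allow() {
    day_dir="$THROTTLE_DIR/$(date +%Y-%m-%d)"
    mkdir -p "$day_dir" || return 1
    # Replace anything that is not part of an IPv4/IPv6 address, so the
    # value is safe to use inside a filename.
    ip_key=$(printf '%s' "$1" | tr -c '0-9A-Fa-f.:' '_')
    [ "$(count_lines "$day_dir/ip_$ip_key")" -lt "$MAX_PER_IP_DAY" ] || return 1
    [ "$(count_lines "$day_dir/global")" -lt "$MAX_GLOBAL_DAY" ] || return 1
    # One line per accepted comment in each counter file.
    printf 'x\n' >> "$day_dir/ip_$ip_key"
    printf 'x\n' >> "$day_dir/global"
}
```

In a CGI script the address would come from the REMOTE_ADDR environment variable. The check and the update are not atomic, so two simultaneous requests could both squeeze past the limit. For a low-volume blog that seems an acceptable trade for the simplicity.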

Pros:

Cons:

2. Option: Moderation

Currently all comments get automatically published. If I have to I can change this to a moderated system where I have to give comments a tick before they appear.

Pros:

Cons:

3. Possibility: better captchas

This is a whole other story.

I think there are ways of rolling my own without having to store anything locally on the server through some clever use of one-way hashes. I might actually try writing one of these -- a single .cgi file implementation that does not require any local storage would be amazing.
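One way the one-way hash idea might work, sketched under several assumptions: the server keeps a secret, hashes it together with the expected answer and an expiry time, and sends the expiry and the hash out as hidden form fields. On submission it recomputes the hash from what the visitor typed. Nothing is stored between requests. The names, the secret and the use of sha256sum (GNU coreutils) are illustrative choices, not a finished design.

```shell
CAPTCHA_SECRET=${CAPTCHA_SECRET:-change-me}   # hypothetical server-side secret
CAPTCHA_TTL=600                               # seconds a challenge stays valid

# captcha_token ANSWER EXPIRY: hash tying the answer to an expiry time
captcha_token() {
    printf '%s:%s:%s' "$CAPTCHA_SECRET" "$2" "$1" | sha256sum | cut -d' ' -f1
}

# captcha_issue ANSWER: prints "EXPIRY TOKEN", to be embedded as hidden fields
captcha_issue() {
    expiry=$(( $(date +%s) + CAPTCHA_TTL ))
    printf '%s %s\n' "$expiry" "$(captcha_token "$1" "$expiry")"
}

# captcha_verify ANSWER EXPIRY TOKEN: returns 0 if the token matches the
# submitted answer and the challenge has not expired yet
captcha_verify() {
    case $2 in ''|*[!0-9]*) return 1 ;; esac    # expiry must be a plain number
    [ "$2" -gt "$(date +%s)" ] || return 1      # reject stale challenges
    [ "$3" = "$(captcha_token "$1" "$2")" ]
}
```

The price of storing nothing is that a solved challenge can be replayed until it expires, so this would need to sit behind throttling. A proper HMAC would also be a sturdier construction than hashing a secret prefix.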

Pros:

Curious implementation detail: no database

This site uses no databases other than the filesystem itself. Every comment is a folder, like these ones:

~/darksleep/public/blog/010_distrohop_p2 $ ls ds_comments/*

ds_comments/1476087963_27865:
author  content  url

ds_comments/1476251459_29803:
author  content  url

ds_comments/1483907474_24753:
author  content  url

ds_comments/1484129284_26265:
author  content  url

...

All user-provided data is stored inside the files rather than in the filenames, to prevent abuse. The folder names are just the current time (seconds since epoch) plus a random number to avoid collisions. A simple sort command gets them in order.
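The layout is simple enough to sketch in a few lines. This is an illustration of the scheme rather than my actual backend: the function names are made up, and the random number is pulled from /dev/urandom, which is just one way to get it.

```shell
# comment_add DIR AUTHOR URL CONTENT: store one comment as its own folder
comment_add() {
    # Folder name: seconds since epoch, an underscore, then a random number.
    rand=$(od -An -N2 -tu2 /dev/urandom | tr -dc '0-9')
    dir="$1/$(date +%s)_$rand"
    # Plain mkdir (no -p) fails if the folder already exists, so a
    # collision is reported instead of overwriting an earlier comment.
    mkdir -p "$1" && mkdir "$dir" || return 1
    # User data only ever goes inside files, never into a path.
    printf '%s\n' "$2" > "$dir/author"
    printf '%s\n' "$3" > "$dir/url"
    printf '%s\n' "$4" > "$dir/content"
}

# comment_list DIR: print the comment folder names, oldest first
comment_list() {
    ls "$1" | sort -n
}
```

Something like `comment_add ds_comments "$author" "$url" "$content"` would then create one folder per comment, and `comment_list ds_comments` gives them back in posting order.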

You don't need a database until you're dealing with thousands and thousands of comments; and by then your site would probably be big enough to warrant the hassle. Until then: don't throw databases at problems your filesystem will happily solve.

A site that never was

I once had the idea of making the site static with zero on-server scripting. Commenting would be done through emails to a specific address, and a script on my local/desktop would read them. New pages would then get pushed to the real webserver.

This would work on even completely static free hosts (ones that allow no CGI or similar). My old ISP still offered something like 64MB of space just for this. I wonder if anyone has ever done this before.

Closing remarks

This site is statically generated. I think it's an absolutely brilliant (and easy) idea. Ignoring the speed benefits, it means the site stays up if I disable the commenting system/script.

In Ruben's case: his site is statically generated and uses a version control system. I'm not sure in which order and how this is set up (he might generate on his personal computer and then push via vcs), so it might be inconvenient for him to go down the road I have.

Suggested solution: write a small .cgi script that handles accepting comments and generating .html files containing nothing but them. Then [iframe] or similar them in to your main pages, so you don't have to modify your main pages (or touch your vcs) when a comment gets added.
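A minimal sketch of the generating half, assuming comments are stored as folders containing 'author' and 'content' files like the ones shown earlier. The function names and the HTML structure are placeholders. The important part is escaping everything a visitor typed before it lands in a page.

```shell
# html_escape: read text on stdin, write it with HTML special characters escaped
html_escape() {
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' -e 's/"/\&quot;/g'
}

# comments_render DIR: print a small HTML page holding every comment in DIR
comments_render() {
    printf '%s\n' '<!DOCTYPE html>' '<html><body>'
    # The glob expands in sorted order, which is posting order as long as
    # the timestamps in the folder names have the same number of digits.
    for c in "$1"/*/; do
        [ -f "${c}author" ] || continue    # also skips the unexpanded glob when DIR is empty
        printf '<div class="comment"><b>%s</b><pre>%s</pre></div>\n' \
            "$(html_escape < "${c}author")" "$(html_escape < "${c}content")"
    done
    printf '%s\n' '</body></html>'
}
```

Running `comments_render ds_comments > comments.html` after each accepted comment keeps the file current, and the main page only ever needs a fixed `<iframe src="comments.html"></iframe>`.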

I've written my backend in bash, because it's really really easy to handle files in. Admittedly it's a little hard to keep things secure -- for instance you have to use 'printf' instead of 'echo' to echo untrustworthy data -- but it's simple and fast. CGI lets you use any language you want, and I'd recommend giving shell a go.
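A tiny demonstration of the echo problem, using a made-up value standing in for something a visitor submitted:

```shell
# Pretend this arrived from a comment form.
untrusted='-n'

# In bash the builtin echo treats -n as an option, so nothing useful is printed.
echo "$untrusted"

# printf with a fixed format string prints the data exactly as given.
printf '%s\n' "$untrusted"
```

Some shells' echo also interprets backslash sequences in the data. The same caution applies to printf itself: untrusted text belongs in the arguments, never in the format string.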

I have the urge to write a portable system anyone can run themselves and embed in their pages, but I don't have the time at the moment. This week has seen four lots of assessments and me getting behind in other work. I chose a good week to try and get my site back together.

Ruben: I'm good at writing long pages and making things seem complicated. Try making your own system.

Hint: HTML forms with some [input type='text'] fields and then a [textarea] last make for output that is very easy to process (with any language).

I'm happy to help out. Ask me questions about any problem; I probably thought of it at some point too :)

I'd also love to hear other people's opinions on this, if you're still reading.


EdS - Monday 16 April 2018

Just noting that you do have at least one reader! But you probably know that from your logs. With forum signups, there may be bots but there also seem to be humans who are probably paid a pittance to get past the captcha and create accounts for later abuse by spammers. Staying under the radar is a good start.

Hales - (site author) - Monday 16 April 2018

Hey Eds!

I only regularly look at logs specifically for my site's CGI (interactive) components. The normal page visits are spammed to a massive degree by bots of all types so it's hard to comprehend things there.

On that note:

2018-04-16T00:57:20+1000 n (x.x.x.x) main: attempting action 'comment_add'
2018-04-16T00:57:20+1000 n (x.x.x.x) fail user: action_comment_add: authorname too short
2018-04-16T00:57:32+1000 n (x.x.x.x) main: attempting action 'comment_add'
2018-04-16T00:57:32+1000 n (x.x.x.x) action_comment_add: author 'EdS' posted comment to '/blog/030_comment_blog_systems/'

Woops. I think I should relax that restriction. I presume you're Ed, not Eds? :P


> With forum signups, there may be bots but there also seem to be humans who are probably paid a pittance to get past the captcha and create accounts for later abuse by spammers

I remember reading something somewhat related to this once. The idea of putting more trust into the users that register. It was by someone who operated a paste-any-html style site that wanted to combat their site being used for abuse.

They introduced a signup system, and even paid tiers, in the hope of removing or reducing the spammers. They then found out it was the spammers who were most likely to sign up for the accounts :)

> Staying under the radar is a good start.

There's a few ways I can look at this and I'm not sure which one you mean. Avoiding publicity?

I thought long and hard about making this post -- whether or not discussing spam problems and the particular ways you could do it on my site could lead to spam -- but I settled on the belief that it's better to share problems than hide them. I think people should be prepared and understand, rather than be afraid.

Hales - (site author) - Sunday 22 April 2018

Ruben's reply: https://rubenerd.com/feedback-on-static-comments/

I'm glad you had more people than just me sharing ideas with you. From my POV, as a reader of your blog that occasionally sends you an email, I have no clue about how many other people actively do the same. No tumbleweeds, just chasms.

> The main downside there is people may not be enthusiastic about commenting if they either need their own blog to link back to mine,

If you are referring to my comment system: the "URL" field is completely optional. The only mandatory components are the Name, AntispamWord and CommentBody. You don't need a blog of your own to reply here.


> DW: For the love of god, DON’T DO BLOG COMMENTS!
> DW: [..]
> DW: People shitpost you for everything and think they are clever. It’s so tiring. Fuck that.

Definitely be prepared. But don't be afraid. It's your comment system, your universe. If people come to your universe to try and stuff you over, then they obviously don't know you're in control of the laws of physics here.

