In any publicly accessible forum, spam and trolling are a huge issue, consuming most administrator-hours and demanding a larger staff than a large site would otherwise need. In comparison, sites that charge for admission and sites that cannot be joined without an invitation have close to no administration overhead beyond maintaining the code and servers and resetting forgotten passwords.

Some public sites (Stack Overflow, Slashdot, and Reddit come immediately to mind) have implemented a form of reputation. This lets other users see at a glance whether the person they're dealing with is stupid, full of shit, or fuckin' nuts. The goal is usually to keep track of troublemakers, but the same system can also be used to find the best users on your site. These are people who probably want to help you, given the opportunity.

Here's an idea: pair the "report" function of a site with a second reputation, one that is the ratio of how many of a user's reports turned out to be valid to how many posts that user has reported in total[1]. When users frequently spot problems with a high level of accuracy, it's safe to take action as soon as a few of these high-reputation users flag the same issue independently, instead of making the issue wait in a queue until a moderator can get around to it.
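As a rough sketch of how that could work, the snippet below computes a reporter's accuracy and decides whether enough trusted reporters have flagged a post to act without a moderator. The field names (`totalReports`, `validReports`), the function names, and all three thresholds are assumptions made up for illustration, not taken from any real site.

```javascript
// All thresholds below are made-up values for illustration.
const MIN_REPORTS = 20;         // reports needed before the ratio means anything
const MIN_ACCURACY = 0.9;       // share of a user's reports that were upheld
const TRUSTED_FLAGS_NEEDED = 3; // independent trusted flags before acting

// Fraction of this user's past reports that moderators judged valid.
function reportAccuracy(user) {
  if (user.totalReports === 0) return 0;
  return user.validReports / user.totalReports;
}

// A reporter is trusted only with enough history and a high enough accuracy.
function isTrustedReporter(user) {
  return user.totalReports >= MIN_REPORTS &&
         reportAccuracy(user) >= MIN_ACCURACY;
}

// flaggers: the users who reported one particular post.
// Each trusted user counts once, however many times they appear.
function shouldActWithoutModerator(flaggers) {
  const trustedIds = new Set(
    flaggers.filter(isTrustedReporter).map((u) => u.id)
  );
  return trustedIds.size >= TRUSTED_FLAGS_NEEDED;
}

const alice = { id: 1, totalReports: 50, validReports: 48 };
const bob   = { id: 2, totalReports: 30, validReports: 29 };
const carol = { id: 3, totalReports: 40, validReports: 37 };
const dave  = { id: 4, totalReports: 5,  validReports: 5 };

console.log(shouldActWithoutModerator([alice, bob, carol])); // true
console.log(shouldActWithoutModerator([alice, bob, dave]));  // false
```

Requiring a minimum number of past reports keeps a newcomer with one lucky report from counting as a perfect reporter.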

This sort of merit-based promotion of volunteer moderators could be automated and invisible to non-administrators: the high-reputation users would not know about their own augmented abilities, only that the site is a better place for their effort.


[1] In 2011, torrez implemented this as a third-party application for Twitter called Later, Spam!

this blog generates its pages by shell script, and it's been a bit of a challenge to make the engine portable, so generated content and the code itself can be run anywhere.

one way this is made easier is by using <base href="<?= $blogroot ?>" />, which tells the user agent to resolve every relative source and reference against $blogroot. since I developed on Safari, everything appeared to work.

then I asked some friends to look at it, and it turned out that the base tag doesn't quite work like I'd hoped: you can't use a relative path as your base href, and Firefox enforces this strictly.

there are two options for fixing this: the blog engine itself has to know where it lives on the server, so it can populate the base href correctly, making generated content less portable -or- the output generation could be smarter and prepend the relative base href to every source and reference it sees, making the output code messier.
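to give a feel for the second option, here's a minimal sketch of that kind of rewriting. it's written in javascript purely for illustration (the blog engine itself is not), `prefixRelativeUrls` is a made-up name, and the regular expression only handles double-quoted src and href attributes, so treat it as a sketch rather than a robust HTML rewriter.

```javascript
// hypothetical helper: prepend `base` to every relative src/href in `html`.
// absolute URLs (with a scheme), root-relative paths, fragment links, and
// empty values are left untouched.
function prefixRelativeUrls(html, base) {
  return html.replace(/(\s)(src|href)="([^"]*)"/g, function (match, space, attr, url) {
    var leaveAlone = url === '' || /^([a-z][a-z0-9+.-]*:|\/|#)/i.test(url);
    if (leaveAlone) {
      return match;
    }
    return space + attr + '="' + base + url + '"';
  });
}

var input = '<img src="a.png"><a href="http://example.com/">x</a><a href="#top">t</a>';
console.log(prefixRelativeUrls(input, '../'));
// prints: <img src="../a.png"><a href="http://example.com/">x</a><a href="#top">t</a>
```

every relative URL in the output now carries the prefix, which is exactly the messiness mentioned above.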

luckily, if you're ok with requiring that your clients support javascript (which I am), there's a third option. behold:

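<script type="text/javascript" id="base_href">
  // helper: insert newNode immediately after refNode, a child of this node
  Node.prototype.insertAfter = function(newNode, refNode) {
    if(refNode.nextSibling) {
      return this.insertBefore(newNode, refNode.nextSibling);
    } else {
      // refNode is the last child, so appending puts newNode right after it
      return this.appendChild(newNode);
    }
  };

  // build an absolute base href: the document's URL up to (not including)
  // its last slash, followed by the blog root that the engine fills in
  var newcontent = document.createElement('base');
  newcontent.href = document.baseURI.substring(0,
    document.baseURI.lastIndexOf('/')) + '<?= $blogroot ?>';

  // place the new base element right after this script tag
  var here = document.getElementById('base_href');
  here.parentNode.insertAfter(newcontent, here);
</script>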

put that at the beginning of your <head> and it will insert the correct absolute base href at runtime.
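in browsers that ship the standard URL constructor, the same resolution can probably be done without string slicing. this is an alternative sketch, not what this blog used; `absoluteBase` and the example URLs are made up for illustration, and it assumes the blog root is a relative path like '../'.

```javascript
// hypothetical helper: resolve a (possibly relative) blog root against
// the URL of the current document and return an absolute URL string
function absoluteBase(documentUrl, blogroot) {
  return new URL(blogroot, documentUrl).href;
}

// in a page it would be used roughly like this:
//   var base = document.createElement('base');
//   base.href = absoluteBase(document.baseURI, '../');
//   document.head.insertBefore(base, document.head.firstChild);

console.log(absoluteBase('http://example.com/blog/posts/entry.html', '../'));
// prints "http://example.com/blog/"
```

letting the URL parser do the resolution means '../' and './' are handled by the same rules the browser uses for ordinary links.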

thanks to Christian Hammond for the idea.

edited on friday february 19th, 2010 at 21:38:

unfortunately this blog no longer uses this hack because the base href was interfering with document bookmarks.

edited on friday april 9th, 2010 at 15:26:

this post used to feature a version using document.write() that didn't work in Opera, IE, or Konqueror. while I haven't extensively tested this new version, I expect it to work better.

the PHP library makes me sad sometimes, like earlier tonight.

I've been working on the blog engine, trying to figure out how to get various bits of markup working that I don't want to have to write by hand every post[1], and it requires doing XML manipulation in PHP.

if you take the time to look around on the internet (and I did), you'll see a lot of people who want to manipulate HTML or XML in PHP, and all the replies recommend things like SimpleXML, which can't remove or change tags, or DOM, which also didn't work for my needs[2]. with those options exhausted, every thread ends with "use str_replace" or "use preg_replace". sigh. what's the point in having these libraries if they're not actually useful?

I want to do this right, damnit, and I'm going to use a tool that parses my pseudo-HTML fragments into a tree and allows me to add, change, or remove tags at will!

luckily, as I was resigning myself to writing a library from scratch, I discovered simplehtmldom. unlike other libraries I've tried using recently, simplehtmldom worked right out of the box with no issues whatsoever. code example follows.


[1] especially footnotes
[2] DOM only accepts properly formed XML or HTML with an all-containing root node and will only create output with a doctype and a single root node. I wanted something that would create output as close as possible to the input.