Getting rid of old spam is hard

Updated November 17, 2021 – My spam remover extension, which uses the Akismet service (this may require paying Akismet), can reliably remove a lot of legacy spam.


phpBB is now pretty good at keeping spam users from registering and posting. In the phpBB 3.0 and 3.1 days, its defenses turned out to be pretty weak. The GD Image spambot countermeasure (still the default) was easily defeated. phpBB has at least added settings that let you tune it better, making it harder to defeat. It also started supporting Google’s reCAPTCHA, but the version in phpBB 3.1 was quickly hacked and phpBB was not agile enough to quickly integrate reCAPTCHA versions 2 and 3.

This led to an inundation of spam on certain forums, mostly bogus registrations but also lots of spam posts in some forums. Some administrators countered by requiring all new users to be approved by an administrator. But when you are inundated with hundreds of these in a short period of time, it’s a hassle to delete them all or to tell the real new users from the spam ones. For a few years, I made quite a bit of money removing spam for clients.

With phpBB 3.2 things slowly got better, at least if administrators used best practices. Best practices were to use reCAPTCHA version 2 (“I am not a robot”) or the Question & Answer countermeasure, provided the questions were sufficiently difficult. A malicious human could still take the time to answer the questions, but this was unusual. There were also a few extensions that could help. The Sortables CAPTCHA was one of the more useful ones.

My go-to for years has been the Cleantalk extension, which requires subscribing to their service. But now there is also an Akismet spam extension. It also requires a subscription, which can be free for personal sites.

All this is good at preventing spam, but how do you get rid of months or years of spam posts? That was my dilemma this week working with a client.

The latest version of the Cleantalk extension has a feature that removes spam users and their posts. But I discovered it has a few serious limitations:

  • It bases its judgment on the IP of the poster. The user’s last IP is stored automatically. It doesn’t examine the post text. Over time, IPs that used to be marked as spam sources get cleaned up, and when this happens they are no longer flagged, so old spam registrations from those IPs aren’t caught.
  • Its interface for finding these users is slow and can easily time out, which means sometimes it can’t succeed. It also lacks pagination.

Why this particular client ignored this problem for a few years, I don’t know. But cleaning up the database was a big challenge. The only real way to do it is to manually look at every post and flag those that are spam, then use moderator tools to get rid of them.

This was prohibitively time-consuming, but if the spam could be removed, presumably people would start posting again and Google would rank the site as legitimate again, bringing in new people.

The next best solution I found was to try to identify when the spam started. This took quite a bit of analysis, but looking in the most active forum on the board, it appeared to have started on February 10, 2018. So I used phpBB’s Prune User feature to remove users who registered after the spam started, along with their posts.

This seems to have gotten rid of the spam. But it also removed accounts of some legitimate users, and their posts as well. Those who had accounts before then were unaffected and their posts remained.
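To make the tradeoff concrete, here is a minimal Python sketch of what pruning by registration date amounts to. It is not how phpBB's Prune User feature is implemented. The function name and the sample users are hypothetical, and it assumes only that registration dates are Unix timestamps, which is how phpBB stores user_regdate.

```python
from datetime import datetime, timezone

# Cutoff taken from the post: the date the spam appeared to start.
SPAM_START = datetime(2018, 2, 10, tzinfo=timezone.utc)


def split_by_registration(users, cutoff=SPAM_START):
    """Split (username, regdate) pairs into kept and pruned lists.

    regdate is a Unix timestamp. Anyone registered on or after the
    cutoff is pruned, whether or not they ever posted spam.
    """
    cutoff_ts = int(cutoff.timestamp())
    kept = [name for name, regdate in users if regdate < cutoff_ts]
    pruned = [name for name, regdate in users if regdate >= cutoff_ts]
    return kept, pruned


def ts(year, month, day):
    """Helper: Unix timestamp for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


# Hypothetical sample data, not taken from any real board.
sample_users = [
    ("longtime_member", ts(2015, 6, 1)),
    ("late_but_legit", ts(2019, 3, 4)),
    ("cheap_pills_4u", ts(2018, 2, 11)),
]

kept, pruned = split_by_registration(sample_users)
print("kept:", kept)
print("pruned:", pruned)
```

The legitimate late joiner lands in the pruned list alongside the spammer, which is exactly the collateral damage described above: a date cutoff knows nothing about what a user actually posted.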

phpBB needs a real solution which so far doesn’t exist.

But I think I have found a solution … if I write the extension. If you are an extension developer, please go ahead and develop it, just tell me so I don’t waste my time.

It turns out that Akismet, the biggest solution out there and widely used in WordPress to moderate comments, has an API for this. Alongside its Submit Spam call, it offers a comment check call. So in theory, if you pass it the needed information, including the poster’s IP and the post text, it can render a judgment on whether the post is spam. If these posts can be flagged, they can then be removed.

One possible issue is that the service requires sending it a User Agent string. phpBB does not store this. Perhaps a fake user agent string could be supplied, but would Akismet still render a correct judgment? If not, this solution wouldn’t work. Also, it requires an Akismet key, which some boards might have to purchase. This may be a limiting factor for some.
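As a rough illustration of the idea, the following Python sketch builds the kind of request Akismet's comment-check call expects for a single forum post. It is a sketch under stated assumptions, not a working extension. The helper names, the stand-in user agent and the sample values are all hypothetical. The field names and the true/false response follow Akismet's REST API as I understand it, so check the current Akismet documentation before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Stand-in value, because phpBB does not store the poster's user agent.
# Whether Akismet judges accurately with a placeholder is untested here.
FALLBACK_USER_AGENT = "Mozilla/5.0 (compatible; unknown)"


def build_check_request(api_key, board_url, poster_ip, post_text, author=""):
    """Build (without sending) a comment-check request for one post."""
    fields = {
        "blog": board_url,            # the site the key is registered for
        "user_ip": poster_ip,         # phpBB stores this with each post
        "user_agent": FALLBACK_USER_AGENT,
        "comment_type": "forum-post",
        "comment_author": author,
        "comment_content": post_text,
    }
    url = "https://%s.rest.akismet.com/1.1/comment-check" % api_key
    return Request(url, data=urlencode(fields).encode("utf-8"), method="POST")


def is_spam(request):
    """Send the request; the response body is 'true' when judged spam."""
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8").strip() == "true"


# Hypothetical values for illustration only; nothing is sent here.
req = build_check_request(
    "examplekey",
    "https://forum.example.com/",
    "203.0.113.7",
    "Buy cheap watches now!!!",
    author="cheap_pills_4u",
)
print(req.full_url)
```

Calling is_spam(req) would actually contact Akismet and needs a valid key. Building the request separately from sending it makes it easy to inspect what would be sent, and to try different stand-in user agents to see whether the judgment changes.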

As I have time, I hope to see if this is a viable approach to finding and removing spam posts in phpBB.

phpBB 3.2 Rhea, first look

I’ve been waiting for the dust to settle to study phpBB 3.2 (Rhea). It is scheduled for release on January 7, 2017. So I finally installed a prerelease version with presumably almost all the bugs fixed. Here’s my first look:

New features

There’s not much new or sexy about phpBB 3.2 compared with phpBB 3.1, but it depends on what you are looking for. New features include:

  • Support for emoji in posts. You can cut and paste or simply type your own emoji shortcuts into posts and the emoji will render. You can find a comprehensive list of emoji shortcuts here. For example, in a post you can enter :grinning: and a scalable grinning emoji should be rendered.
  • Supports PHP 7.1. PHP 7 is a quantum leap in speed for the PHP script processor. Most sites can expect a 100% improvement in how quickly PHP will parse and render code written in PHP.
  • Global announcements are no longer an administrator only privilege.
  • FontAwesome support. FontAwesome allows scalable vector fonts and icons, controlled by cascading stylesheets. For example, if you have a FontAwesome icon of an airplane, the icon will scale to size as you increase magnification on the page without losing detail. In addition, FontAwesome allows the size, color and shadow of the font to be changed on the fly using CSS … no jQuery magic required anymore.
  • New installer. This is backend stuff. Installing phpBB looks a bit different, and looks spiffier. Before I could install it, however, I first had to run PHP from the command line to kick off a run of PHP’s Composer software. Composer is used to fetch the third party libraries that phpBB uses, presumably to get a current version of these libraries. Previously they were bundled into the phpBB archive you downloaded. It’s unclear to me if this is something you will have to do before installing phpBB 3.2 once it is released. If so, it will prove an obstacle to many casual forum administrators, since they may not be familiar with working from a command prompt and it may not be an option on shared hosting. The installer’s command line interface has also been reworked. I have not yet investigated what’s new here.

Other changes of note

  • The default prosilver style looks a little bit darker, and the icons have been reworked and look a bit different, and are seamlessly scalable because they will use FontAwesome.
  • No subsilver2 support. Someone developed a subsilver2 style for phpBB 3.1 but it was not responsive (scalable for mobile devices). With 3.2 only responsive styles are supported. subsilver2 uses HTML tables to lay out content, which is not responsive, hence it is not supported.
  • The reCAPTCHA spambot countermeasure has been updated to use Google’s latest (presumably the checkbox where you assert you are a human). The old one had been hacked, so this is encouraging. Perhaps it will be useful as a spambot countermeasure again.
  • New events. This is only of interest to extension authors. They have more places in the code and in templates to hook in additional functionality.
  • BBCode overhaul. TextFormatter has been integrated into phpBB to render BBCode, making a lot of longstanding BBCode related bugs go away.

Some cautions

  • You should first upgrade phpBB from 3.1 to 3.2 before you change your web host control panel to use PHP 7. (Note: if you have other PHP applications installed, make sure they can handle PHP 7!)
  • You must run at least PHP 5.4 if you want to run phpBB 3.2, so this may require a web host control panel change. Make this change before upgrading phpBB.
  • While most 3.1 extensions will probably work fine in the 3.2 architecture, some will require changes if only to assert that they will work under 3.2. I have not tested my Digests and Smartfeed extensions with 3.2 yet, but I expect no issues. I will have to issue new versions since ext.php will have to allow phpBB 3.2 to be used.
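
The ordering in the first two cautions can be expressed as a small compatibility check. This Python sketch encodes only what is stated above: phpBB 3.2 needs at least PHP 5.4, and phpBB 3.1 should not be moved to PHP 7. The function names are hypothetical and the check is deliberately rough. It is not an official requirements matrix.

```python
def parse_version(text):
    """Turn '5.4.7' into (5, 4, 7) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def php_ok_for_phpbb(phpbb_version, php_version):
    """Rough check based only on the cautions in this post."""
    phpbb = parse_version(phpbb_version)
    php = parse_version(php_version)
    if phpbb >= (3, 2):
        return php >= (5, 4)   # 3.2 needs at least PHP 5.4
    return php < (7, 0)        # assumption: keep 3.1 off PHP 7


# Switching to PHP 7 before upgrading phpBB fails the check...
print(php_ok_for_phpbb("3.1.10", "7.1.0"))   # False
# ...as does upgrading phpBB while still on a PHP older than 5.4.
print(php_ok_for_phpbb("3.2.0", "5.3.29"))   # False
# Upgrading phpBB first and then moving to PHP 7 passes.
print(php_ok_for_phpbb("3.2.0", "7.1.0"))    # True
```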

phpBB 3.1 end of life support

  • The support forums on phpbb.com will provide support for phpBB 3.1 through the end of 2017.
  • New releases of phpBB 3.1 are expected as needed through July 2017. A phpBB 3.1.11 release is in the works.

Should you upgrade now?

In general it’s a bit dangerous to be first out of the gate when phpbb.com releases a new minor version of phpBB. Unless there is a compelling reason otherwise, I’d wait a few months before upgrading to 3.2. If you use the prosilver style with no extensions, though, it might be worth installing when available. Check your styles and extensions and make sure each supports 3.2 before upgrading, or be prepared to use a standard style and have incompatible extensions disabled.

If you would like me to upgrade you to 3.2, contact me. Most upgrades cost $30 USD.