Posts

Recent posts

The Moderation Problem Nobody Is Solving

[ 5 min read · Hub post for a series on next-generation AI moderation ]

By the time a moderation report gets filed, the damage is already done. The post has been seen, reacted to, and replied to. Those replies have generated their own replies. Good-faith users have been drawn into something they didn't start. The person who lit the match is still posting. And somewhere in a queue, a moderator is about to open a ticket that describes the fire, with no information about what caused it.

This is how every major community platform moderates in 2026. Not because better tools don't exist. Because nobody has connected the tools that do exist into a system that actually closes the loop.

The match, not the fire

Two types of content generate a disproportionate share of moderation work on every platform, and neither looks like a rule violation on the surface. The first is the ragebait topic. A user opens a thread des...

You're Going to Build an AI Layer Anyway. Build the Useful Version

The Cost of Doing Nothing Has Been Quietly Compounding [ 8 min read · Part of a series on AI-powered moderation: start here ]

Every platform running a reactive moderation system is making a choice. Not a considered choice with a cost-benefit analysis attached: a default choice, made by inertia, to keep doing what was done before because changing it requires a decision and not changing it doesn't.

That choice has a cost. It just doesn't show up in a single line on a budget report. It accumulates in user churn, community trust erosion, reputational incidents, and the quiet departure of exactly the kind of users who make a platform worth being on. By the time it's visible it's already been compounding for a while.

What's being proposed, and why it's different from what exists

Three major platforms have already demonstrated that AI assessment of posts before they go live works at scale. Instagram has been doing it since 2019. Reddit...

The Intervention Point Is in the Wrong Place

[ 8 min read · Part of a series on AI-powered moderation: start here ]

Current forum moderation is architecturally a logging system. Something happens, it gets reported, it enters a queue, someone reviews it. The entire pipeline runs after the fact. The damage (the angry replies, the derailed thread, the good-faith users who got drawn in and penalised) has already propagated by the time any of it is actioned.

The technical problem isn't that platforms lack the tools to do better. It's that the intervention point has been placed at the wrong end of the pipeline. This post describes a two-pass architecture that moves it.

The data model everything else builds on

Before getting into the passes, one structural observation that makes the whole system simpler than it might appear: every post in a forum already has exactly one of two states. It is a root post: it replies to nothing. It is the origin of a ...

Your Moderation Team Is Solving the Wrong Problem

[ 7 min read · Part of a series on AI-powered moderation: start here ]

A report lands in the queue. A moderator opens it, sees two users in a heated argument, applies the closest matching rule, and closes the ticket. The metric moves. The queue shortens slightly. And somewhere in the forum, the same problem that generated that report is already generating the next one.

This is the moderation loop most platforms are stuck in. Not because the people running it aren't good at their jobs; they are. But because by the time a report exists, it's already a record of damage that has been propagating for minutes, sometimes hours. The report isn't the beginning of the problem. It's evidence that the problem has been running for a while.

Why your queue never really empties

Every second between a harmful post going live and a moderator actioning it is a second where other users are reading it and react...

When "No" Doesn't Mean No: The Hidden Language Problem in AI

Imagine you've just hired a personal assistant. On their first day, you hand them a list of tasks and say clearly: "I'd like you to research some big screen TVs for me, but do not buy anything yet. Just look." You come back an hour later. There's a 75-inch television on its way to your door.

Frustrated, you go to your manager and explain what happened. Instead of acknowledging the mistake, the manager says: "Oh, you should know: they don't respond well to instructions phrased as negatives. Next time try saying 'compile a research list of TVs, and wait for my go-ahead before purchasing.' That works better."

Would you find that acceptable? Probably not. And you'd be right not to. This is not a hypothetical. This is, in essence, what is happening right now with AI, and most people have no idea.

A Real Example From the Internet

A Reddit thread recently s...

Is ChatGPT Clickbaiting Me?

Recently I tried ChatGPT again to help me draft some fiction and discovered it had a new, irritating habit I hadn't experienced before: withholding work and phrasing it as something tantalizing it could do for me. And the phrasing really sounds like clickbait. This is NOT a one-time or sometimes thing. It's almost every single time I ask it to help me draft or revise. Why would I possibly want to pay for a tool that sounds like it deliberately withholds what I asked for in order to sell it to me in another step?

The more innocent explanation is that ChatGPT learnt that clickbait gets continued interaction, just as social media influencers figured out which clickbait baits you into clicking. You can see the screenshots in the slideshow above and the relevant transcript below, and decide how to interpret it for yourself. Afterwards I discuss this with Claude AI (which resulted in some edits to this blog post), and Claude proposes a reason why ChatGPT started talking like this.

===BEGIN TRANSC...