
How I Built an Automated WordPress Post Reviewer (Dead Links, Stale Content, Signal Alerts)

I run a personal blog and, like most people who’ve been publishing for a while, I have a graveyard of older posts. Some have dead links. Some reference AI models or tools that have since been superseded. Some just deserve a second look.

I built a small automated tool to handle this — a WordPress post reviewer that runs on a schedule, picks a post, checks its links, flags stale content, and sends me a summary on Signal. Here’s how it works and how I built it.

The Problem

Content rot is real. A post that was accurate and well-linked in 2023 might have three dead URLs and a reference to “the latest GPT-4 model” by 2026. Most bloggers never go back and check. I wanted a system that would do it for me automatically, without requiring me to remember to do it.

The requirements were simple:

  • Pick a post automatically (with smart weighting — never-reviewed posts get priority)
  • Check every external link for 404s and timeouts
  • Have an AI review the content for stale quotes, outdated stats, and obsolete references
  • Send me a summary I can actually act on
  • Track which posts have been reviewed so nothing gets skipped indefinitely

How It Works

The system has three parts: a Python script, an AI review pass, and a Signal summary, wired together by a cron job in OpenClaw.

Part 1: The Python Script (wp_post_reviewer.py)

The script does the mechanical work:

  1. Fetches all published posts via the WordPress REST API
  2. Picks one using weighted randomization — posts that have never been reviewed get a 3× weight bonus; older posts get an additional age bonus. This means neglected posts rise to the top without the selection feeling mechanical (first sketch after this list).
  3. Checks up to 20 external links using HTTP HEAD requests, flagging anything that returns a 400+ status or times out (second sketch below)
  4. Logs the review to a local wp_audit_log.json file so it knows what’s already been seen
  5. Outputs structured JSON — post metadata, link results, and content — ready for the agent to consume
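
Here is a minimal sketch of the fetch-and-pick logic. The 3× never-reviewed bonus is the real rule; the exact age-bonus factor, the audit-log shape, and some field names are illustrative assumptions, not lifted from the actual script.

    import random
    import requests
    from datetime import datetime, timezone

    def fetch_posts(base_url: str) -> list[dict]:
        """Pull every published post from the WordPress REST API, page by page."""
        posts, page = [], 1
        while True:
            resp = requests.get(
                f"{base_url}/wp-json/wp/v2/posts",
                params={"per_page": 100, "page": page},
                timeout=30,
            )
            if resp.status_code == 400:  # WP returns 400 once you page past the end
                break
            resp.raise_for_status()
            batch = resp.json()
            if not batch:
                break
            posts.extend(batch)
            page += 1
        return posts

    def pick_post(posts: list[dict], audit_log: dict) -> dict:
        """Weighted random choice: never-reviewed posts get 3x weight, plus an age bonus."""
        now = datetime.now(timezone.utc)
        weights = []
        for post in posts:
            published = datetime.fromisoformat(post["date_gmt"]).replace(tzinfo=timezone.utc)
            weight = 1.0 + 0.5 * ((now - published).days / 365)  # assumed age factor
            if str(post["id"]) not in audit_log:
                weight *= 3.0  # the 3x never-reviewed bonus
            weights.append(weight)
        return random.choices(posts, weights=weights, k=1)[0]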
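
The link check is the other interesting bit. A sketch of the per-URL HEAD request, assuming the requests library; the retry-with-GET step is my own extra assumption, since some servers reject HEAD outright and would otherwise show up as false positives:

    import requests

    def check_link(url: str, timeout: float = 10.0) -> tuple[str, str, bool]:
        """Return (url, status, is_dead). Anything 400+ or unreachable counts as dead."""
        try:
            resp = requests.head(url, timeout=timeout, allow_redirects=True)
            if resp.status_code in (403, 405, 501):
                # servers that reject HEAD get one GET retry before being flagged
                resp = requests.get(url, timeout=timeout, stream=True)
            return url, str(resp.status_code), resp.status_code >= 400
        except requests.RequestException as exc:  # timeouts, DNS failures, SSL errors
            return url, type(exc).__name__, True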

Part 2: The AI Review Pass

The JSON output gets passed to Claude Sonnet, which reviews it for:

  • Dead links — with exact URL and HTTP status code
  • Stale content — time-anchored language (“recently”, “the latest model”, “last year”), specific outdated statistics, references to tools or products that have changed
  • Revival potential — could this post be updated and re-promoted, or is it fine as-is?
  • Tags and categories — are they still accurate?

The agent quotes specific text rather than giving vague summaries. If something is stale, it tells me exactly which sentence.
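
In my setup the cron job hands this JSON to the agent pass inside OpenClaw, so there is no API code of my own involved. If you wanted to run the same review standalone, a sketch using the Anthropic Python SDK might look like this (the prompt wording and model string are illustrative, not what the agent actually uses):

    import json
    import anthropic

    PROMPT = """Review this blog post audit report. Quote exact sentences for
    anything stale; never give vague summaries. Cover: dead links (URL plus
    HTTP status), time-anchored language, outdated stats or references,
    revival potential, and whether the tags and categories still fit.

    {report}"""

    def review(report: dict) -> str:
        client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
        message = client.messages.create(
            model="claude-sonnet-4-20250514",  # illustrative; pin whichever Sonnet you run
            max_tokens=2000,
            messages=[{"role": "user", "content": PROMPT.format(report=json.dumps(report))}],
        )
        return message.content[0].text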

Part 3: The Signal Summary

The result always arrives as a Signal message — even if everything is clean. A typical report looks like this:

📋 Weekly Post Audit
"AI Models Compared: March 2025 Edition" • Published 2025-03-12 • ~400 days old
https://adrianhensler.com/ai-models-march-2025

🔗 Links: 8 checked, 1 dead
  ❌ https://openai.com/blog/gpt-4-research → 404

📅 Potentially stale:
  ⚠️ "GPT-4 is currently the most capable model available" — paragraph 2
  ⚠️ "Claude 3 was just released" — paragraph 5

🔄 Revival?
  Strong candidate — update the model comparison table and republish.
  Core structure is solid; just the specifics have aged.

🏷️ Tags/Categories: ✅ Appropriate

📌 Overall: Update recommended
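
Delivery is OpenClaw's job in my setup, so the script never talks to Signal itself. A standalone equivalent could shell out to signal-cli, assuming it is installed and the sending account is already registered:

    import subprocess

    def send_signal(summary: str, account: str, recipient: str) -> None:
        # account and recipient are phone numbers in E.164 form, e.g. "+15551234567"
        subprocess.run(
            ["signal-cli", "-a", account, "send", "-m", summary, recipient],
            check=True,
        )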

Bugs I Hit During the Build

Two bugs showed up during the first real run, and they’re worth documenting because they’re easy to miss.

Bug 1: days_old Always Returned 0

The script was calculating post age by comparing a naive datetime (parsed from WordPress's date field, which carries no timezone) against the timezone-aware result of datetime.now(timezone.utc). Python raises a TypeError on any naive-versus-aware comparison, the try/except caught it silently, and the fallback set days_old = 0.

Fix: switch to WordPress's date_gmt field (UTC) and compare against datetime.now(timezone.utc). A one-line change, but invisible until the output looked wrong.
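
For reference, the corrected calculation. The REST API serves date_gmt in UTC but without an offset suffix, so the tzinfo has to be attached by hand:

    from datetime import datetime, timezone

    # date_gmt looks like "2025-03-12T09:30:00": UTC, but parsed as naive
    published = datetime.fromisoformat(post["date_gmt"]).replace(tzinfo=timezone.utc)
    days_old = (datetime.now(timezone.utc) - published).days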

Bug 2: Link Checker Found Zero Links

The link extractor used a regex for href="..." attributes. It worked fine for recent posts. But my older daily briefing posts used a plain-text link format — Link: https://... — not anchor tags. The regex correctly found zero href attributes, because there weren't any.

Fix: add a secondary regex pass for bare URLs in the content. Both patterns now run and results are de-duplicated.
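
Roughly, the two-pass extraction (the exact patterns in the script differ; these are representative):

    import re

    HREF_RE = re.compile(r'href="(https?://[^"]+)"')
    BARE_RE = re.compile(r'https?://[^\s"<>)\]]+')  # bare URLs in plain text

    def extract_links(content: str) -> list[str]:
        found = HREF_RE.findall(content) + BARE_RE.findall(content)
        # strip trailing punctuation, then de-duplicate preserving first-seen order
        return list(dict.fromkeys(url.rstrip(".,;") for url in found))

The bare-URL pass re-finds everything the href pass already caught, which is exactly why the de-duplication step matters.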

The Schedule

The whole thing runs as a cron job in OpenClaw — isolated, on its own schedule, separate from the main session. It fires a few times a week, picks a post, runs the review, and sends the Signal message. I don’t have to think about it until something lands in my messages worth acting on.

What I’d Add Next

  • A “fix this post” command that opens a draft with suggested edits already applied
  • Tracking which specific URLs have been flagged, so recurring dead links surface across multiple posts
  • A monthly digest of everything reviewed and what actions were taken

For now, it does what I needed: it keeps an eye on the archive so I don’t have to.


The script runs on OpenClaw, a self-hosted personal AI gateway. The post selection, link checking, and AI review are all handled locally — no third-party content scanning service involved.
