Building a Podcast-Generation MVP in OpenClaw: from question to dual-voice POC

One thing worth noting: I did all of this, including writing this post, from my cellphone on the bus ride home. I wanted to see whether I could build a practical “research-to-podcast” workflow inside OpenClaw without over-engineering it. This post documents what I did, what worked, what broke, and what I would improve next.

Goal

Build a portable OpenClaw skill that turns a source memo into a podcast-style script, then render a dual-voice audio MVP.

What I built

Skill: skills/research-podcast/
Phase 1: script generation (cost-aware model routing)
Phase 2: dual-voice stitched audio proof of concept
Demo input: pfSense vs OPNsense on Intel N305 review memo

Abbreviated Build Conversation (Edited for Clarity)

Below is a concise, faithful summary of the back-and-forth that led to the current MVP.

Adrian: Can this be built as a reusable skill, with cost-aware model choices?
Assistant: Yes. Start script-first, then add audio/TTS as a second phase.
Adrian: Use the pfSense/OPNsense N305 review as the demo input.
Assistant: Implemented. Initial script generated.
Adrian: Make it less rigid Q&A and more conversational (still professional).
Assistant: Prompting updated to use narrative flow with key decision questions woven naturally.
Adrian: Compare cheap/value/premium models and summarize.
Assistant: Ran matrix across nano, 4o-mini, Kimi, GLM, Sonnet, and mixed modes. Found response-shape issues on some models and patched the parser accordingly.
Adrian: Which model choices stood out?
Assistant: Kimi performed well on value; Sonnet gave the best narrative quality; 4o-mini outline + Sonnet expansion was a strong quality/cost compromise.
Adrian: Generate audio examples.
Assistant: First pass was single-voice (not ideal). Second pass implemented dual-voice host segmentation and stitched output.
Adrian: Document the process in a WordPress draft with artifacts.
Assistant: Draft created, then refined for tone, formatting, and readability. Matrix and script sections were further polished and corrected.

Testing process

I ran a model matrix across cheap, value, mixed, and premium configurations. Early tests exposed a response-shape issue: some models returned reasoning-style fields instead of clean content, which broke naive parsing. I patched the parser to handle those payloads and reran the matrix.
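To make the response-shape issue concrete, here is a minimal sketch of a tolerant parser. It is not the actual patch: it assumes a chat-completion-style payload, and the field names `reasoning_content` and `reasoning` are guesses at what a reasoning-style response might carry. The function also reports whether the text came from a fallback field, so the caller can decide to sanitize or reject it.

```python
from typing import Any

# Hypothetical field names: these are assumptions about where a
# reasoning-style payload might put its text, not a documented schema.
REASONING_KEYS = ("reasoning_content", "reasoning")


def extract_text(response: dict[str, Any]) -> tuple[str, bool]:
    """Return (text, came_from_reasoning_field) for a chat-style response."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response has no choices")
    message = choices[0].get("message") or {}

    content = message.get("content")
    # Some payloads carry content as a list of typed parts, not a string.
    if isinstance(content, list):
        content = "".join(
            part.get("text", "") for part in content if isinstance(part, dict)
        )
    if isinstance(content, str) and content.strip():
        return content.strip(), False

    # Fall back to reasoning-style fields only when content is empty.
    for key in REASONING_KEYS:
        value = message.get(key)
        if isinstance(value, str) and value.strip():
            return value.strip(), True

    raise ValueError("no usable text found in response")
```

A caller could treat a `True` flag as a signal to run extra cleanup before the text goes anywhere near the script or TTS stage.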

Model matrix (cleaned)

Profile	Outline	Expand	Status	Approx length	Takeaway
cheap_nano_nano	gpt-5-nano	gpt-5-nano	Pass (after patch)	~1850 words	Very cheap, but at risk of noisy reasoning artifacts
cheap_4omini	gpt-4o-mini	gpt-4o-mini	Pass	~550 words	Clean but often too short
cheap_kimi	kimi-k2	kimi-k2	Pass	~900 words	Best value/quality balance in this run
cheap_glm	glm-5	glm-5	Pass (after patch)	~850–1550 words	Useful but verbosity varied
mixed_nano_kimi	gpt-5-nano	kimi-k2	Pass (after patch)	~580 words	Cheap and concise, mixed quality
mixed_nano_4omini	gpt-5-nano	gpt-4o-mini	Pass (after patch)	~620 words	Cheap and concise
premium_sonnet	sonnet-4.6	sonnet-4.6	Pass	~1490 words	Best narrative quality, highest cost
mixed_4omini_sonnet	gpt-4o-mini	sonnet-4.6	Pass	~1487 words	Near-premium quality at lower cost than full premium

Decisions I made

Value mode: Kimi K2
Higher-quality mode: 4o-mini outline + Sonnet expansion
Guardrail: avoid raw reasoning-heavy outputs unless sanitized

Audio MVP and correction

My first TTS pass was wrong for the target result (single voice instead of podcast style). I corrected that with a dual-voice pipeline: parse Host A/Host B turns, synthesize each turn with distinct voices, insert pauses, then stitch to one MP3.
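The parse, synthesize, pause, and stitch steps can be sketched roughly as follows. This is an illustration under assumptions, not the pipeline I ran: the voice names are placeholders, `synthesize` stands in for whichever TTS client is used, and joining raw bytes is only a stand-in for real audio stitching, which would normally go through an audio tool or library.

```python
import re
from typing import Callable

# Matches turns written as "**Host A:** text", the format used in the script.
TURN_RE = re.compile(r"^\*\*(Host [AB]):\*\*\s*(.+)$")

# Placeholder voice names, not real TTS voice identifiers.
VOICES = {"Host A": "voice_a", "Host B": "voice_b"}


def parse_turns(script: str) -> list[tuple[str, str]]:
    """Return (speaker, text) pairs, skipping headings, rules, and notes."""
    turns = []
    for line in script.splitlines():
        match = TURN_RE.match(line.strip())
        if match:
            turns.append((match.group(1), match.group(2).strip()))
    return turns


def render_episode(
    script: str,
    synthesize: Callable[[str, str], bytes],
    pause: bytes = b"",
) -> bytes:
    """Synthesize each turn with its speaker's voice and join with pauses."""
    clips = [
        synthesize(text, VOICES[speaker])
        for speaker, text in parse_turns(script)
    ]
    # Byte concatenation is a simplification; real output needs proper
    # audio handling so that clips and pauses share one format.
    return pause.join(clips)
```

Because only lines that start with a bolded host label are matched, section headings and the bulleted action list are silently dropped from the audio, which may or may not be what a given episode wants.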

Artifact: dual-voice audio example #2

▶ Listen to the audio sample:

Download MP3

Script used for audio example #2 (full, untruncated)

# pfSense vs OPNsense on Intel N305

## A Practical Decision Guide for Home and Prosumer Setups

---

## Cold Open

**Host A:** If you're running a security-focused home or prosumer network, you've probably already looked at pfSense and OPNsense. Both are serious platforms. Both run fine on modern mini-PC hardware. But "both work" isn't a decision — and today we're going to make one.

**Host B:** Specifically, we're looking at the Intel N305 platform. We'll cover which firewall OS fits better right now, what the real migration risks are if you're moving from pfSense, and what a low-risk pilot actually looks like before you touch production.

**Host A:** We'll also give you concrete pass/fail criteria and a rollback requirement. Not suggestions — requirements. Let's get into it.

---

## Section 1: The Hardware Baseline

**Host B:** Start with the N305 itself. It's an Intel Alder Lake-N chip — eight efficiency cores, no performance cores. It's not a powerhouse, but for firewall work it's more than adequate.

**Host A:** Right. For baseline routing, DHCP, DNS, NAT — the N305 has significant headroom. The more interesting question is VPN. And here the N305 holds up reasonably well because it supports AES hardware acceleration. That matters for IPsec and OpenVPN cipher performance.

**Host B:** WireGuard is the stronger story on this class of hardware. It's lighter by design, and you'll generally see better throughput and lower CPU overhead compared to OpenVPN at equivalent tunnel configurations.

**Host A:** OpenVPN can still bottleneck on single-thread performance depending on cipher and settings. That's not an N305 problem specifically — it's a known OpenVPN architectural constraint. Worth knowing before you plan your VPN workload.

**Host B:** So the CPU question is mostly settled. The N305 is sufficient. The more important hardware variable is the NIC — and we'll come back to that because it's actually the bigger risk factor in this whole decision.

---

## Section 2: Platform Comparison

**Host A:** Both pfSense and OPNsense are BSD-based firewall distributions. They share a common ancestor — m0n0wall, then pfSense, then OPNsense forked from pfSense in 2015. So the underlying architecture is similar, but the projects have diverged meaningfully since then.

**Host B:** pfSense's strengths are real. It has a long-running ecosystem, extensive community documentation, and a large base of enterprise and hobbyist deployments. If you're looking for a conservative, well-documented platform with years of forum answers, pfSense has that.

**Host A:** The tradeoff is release cadence and governance. Netgate controls pfSense development, and the release pace is more conservative. Some features that are now standard in OPNsense took longer to appear or required workarounds in pfSense.

**Host B:** OPNsense moves faster. WireGuard is built in natively. The reporting and traffic insight tooling is more developed out of the box. The UI is generally considered cleaner for iterative work — adding rules, adjusting VLANs, reviewing logs. For someone who's going to be in the interface regularly, that matters.

**Host A:** Governance is also different. OPNsense is fully open — Deciso develops it but the project is open-source without the licensing friction that pfSense has had. That's a practical concern if you care about long-term tooling independence.

**Host B:** For a security-focused setup where you're iterating frequently — tuning IDS rules, adjusting VPN configs, testing new features — OPNsense is the better default candidate right now. That's our interpretation, not a hard fact, but it's grounded in the current state of both projects.

**Host A:** pfSense isn't a wrong answer. If you're already running it, know it well, and have a stable config, the case for migrating has to be stronger than "OPNsense looks nicer." We'll get to that calculus.

---

## Section 3: The NIC Risk — This Is the Real Variable

**Host B:** Let's talk about the NIC question, because this is where people get surprised. The CPU is fine. The OS is capable. The thing that actually causes instability in BSD firewall stacks is often the network interface hardware.

**Host A:** Specifically: Realtek NICs. Realtek drivers on FreeBSD — which is what both pfSense and OPNsense run on — have a documented history of inconsistency. Not always broken, but not reliably stable under sustained traffic, and the driver quality varies by chipset generation.

**Host B:** Intel NICs — the i210, i225, i226 series — are the standard recommendation for stable production behavior on BSD firewall stacks. If your N305 box has Intel NICs, you're in a much better position. If it has Realtek, that's a risk you need to validate explicitly, not assume away.

**Host A:** This is a caution note, not a certainty. Some Realtek configurations run fine. But if you're building or buying hardware for this purpose, Intel NIC is the right spec to require. It removes a variable that's hard to debug after the fact.

**Host B:** So before you commit to any migration plan, confirm the NIC chipset in your target hardware. That single data point changes the risk profile of the whole project.

**Host A:** And firmware matters too. Current firmware on Intel NICs, especially the i225/i226 which had early errata, is important. Don't assume a new box shipped with current firmware.

---

## Section 4: Migration Risks and What to Inventory First

**Host B:** If you're moving from pfSense to OPNsense, the top risks aren't philosophical — they're config translation and plugin parity.

**Host A:** The two platforms don't share config formats. You can't export a pfSense backup and import it into OPNsense. Everything has to be rebuilt or manually translated. That's not a dealbreaker, but it means you need a complete inventory before you start.

**Host B:** The migration worksheet should cover four areas: interface mapping — WAN, LAN, VLANs, physical port assignments. Rules and NAT — every firewall rule, every NAT entry, documented and mapped to OPNsense equivalents. VPN configuration — all tunnels, certificates, peer configs. And package or plugin equivalence — every pfSense package you're using needs a verified OPNsense equivalent or a documented replacement plan.

**Host A:** That last one catches people. Some pfSense packages don't have direct OPNsense equivalents. Some have equivalents with different configuration models. You need to know this before cutover, not during.

**Host B:** The other migration risk is behavioral differences. Even when features exist on both platforms, defaults and behaviors can differ. DNS resolver settings, DHCP lease handling, traffic shaping — test these explicitly, don't assume parity.

**Host A:** And downtime planning. A staged cutover with a real rollback path is the only acceptable approach here. We'll cover what that rollback plan has to include.

---

## Section 5: The Pilot Plan and Pass/Fail Criteria

**Host B:** The pilot structure is straightforward, but the sequencing matters. Don't try to validate everything at once.

**Host A:** Phase one is core network policy only. Stand up the test node — either dedicated hardware or a maintenance-window clone — and replicate just the essentials: WAN and LAN interfaces, VLAN assignments, DHCP, DNS, NAT, and your primary firewall rules. Nothing else yet.

**Host B:** Get that stable first. Run it for a few days with real traffic if possible. Watch for interface flaps, DHCP failures, DNS resolution issues. These are your baseline indicators.

**Host A:** Phase two adds VPN. Bring up your WireGuard or OpenVPN tunnels, verify throughput against your actual workload targets — not synthetic benchmarks, your real use case. Latency and packet loss matter here, not just raw throughput numbers.

**Host B:** Phase three adds IDS/IPS. This is where CPU load increases meaningfully. Enable Suricata or Zenarmor, apply your ruleset, and verify that latency and packet loss stay within acceptable bounds. IDS/IPS can introduce real overhead — validate it doesn't degrade your primary traffic.

**Host A:** Then the soak period. Seven days minimum with metrics collection: uptime and reboot count, NIC link stability — no unexplained flaps, CPU and temperature under sustained load, VPN throughput and latency, and DHCP/DNS stability for all primary clients.

**Host B:** The pass/fail criteria should be explicit before you start the pilot, not evaluated after. Here's the list: No unexplained interface flaps under sustained traffic. Stable DHCP and DNS for all primary clients throughout the soak period. VPN throughput within your defined acceptable range for real workloads. IDS/IPS enabled without unacceptable latency or packet loss — define "unacceptable" for your environment before you test. And a verified, tested rollback path before any production cutover.

**Host A:** That last one is the gate. If you haven't tested the rollback, you haven't completed the pilot.

---

## Section 6: The Rollback Plan and Final Decision Framework

**Host B:** The rollback plan isn't optional and it isn't just "keep the old hardware around." It needs to be specific and tested.

**Host A:** At minimum: your pfSense config backup must be current and verified restorable. Your pfSense hardware must be in a known-good state — powered down but ready, not repurposed. Your rollback RTO needs to be defined — how long can your network be down if the cutover fails? And someone needs to have actually run through the restore process, not just assumed it works.

**Host B:** If your rollback RTO is under an hour, you need the old hardware physically available and the restore procedure documented step by step. If it's longer, you have more flexibility, but the backup verification is still mandatory.

**Host A:** On the final go/no-go decision: if you're on pfSense, stable, and not hitting any specific limitations — the migration cost is real and the benefit needs to justify it. Wanting a cleaner UI isn't enough on its own.

**Host B:** If you're building new, or if you're hitting pfSense limitations — slower feature availability, governance concerns, specific plugin gaps — OPNsense on Intel-NIC N305 hardware is the right default choice for a security-focused setup right now.

**Host A:** If your NIC situation is uncertain, resolve that first. Everything else is secondary to hardware stability.

**Host B:** And if you're mid-evaluation with unknowns — exact NIC chipset unconfirmed, plugin equivalence unmapped, rollback untested — you're not ready for production cutover. Complete the inventory, run the pilot, validate the criteria.

**Host A:** That's the framework. Let's close with the action list.

---

## What to Do Next

- **Confirm your NIC chipset** before anything else — Intel i210/i225/i226 is the target; Realtek requires explicit validation and current firmware check.
- **Build a migration worksheet** covering interface map, firewall rules and NAT, VPN configs, and package/plugin equivalence with verified OPNsense counterparts.
- **Stand up a test node** and validate core network policy — interfaces, DHCP, DNS, NAT, firewall rules — before adding VPN or IDS/IPS.
- **Define pass/fail criteria explicitly** before starting the pilot: acceptable VPN throughput, maximum tolerable latency with IDS/IPS enabled, zero unexplained interface flaps over seven days.
- **Verify your rollback path** — restore a pfSense config backup to known-good hardware and confirm the procedure works before any production cutover.

---

*This episode is based on an initial decision memo. Final go/no-go should follow a completed config inventory and pilot validation — not this conversation alone.*

**Host A:** Thanks for listening. The goal is simple: pick the platform that matches your operating style, then prove it with a controlled pilot.

**Host B:** Exactly. Validate, measure, and keep rollback ready. Until next time, stay secure.

What I would improve next

Add stricter sanitization rules for reasoning-style responses
Log exact per-run input/output/cache cost in a machine-readable report
Improve dual-voice quality (voice model selection, pacing, prosody)
Add pronunciation dictionary for technical terms
Add mode presets (cheap/default/premium) as first-class CLI options
Package as one-command end-to-end workflow
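As a rough idea of what the pronunciation dictionary item could look like, here is a small sketch that rewrites technical terms before the text reaches TTS. Nothing like this exists in the skill yet, and the respellings are guesses for illustration that would need tuning by ear for each voice.

```python
import re

# Example respellings are guesses for illustration only.
PRONUNCIATIONS = {
    "pfSense": "P F sense",
    "OPNsense": "open sense",
    "N305": "N three oh five",
    "IPsec": "I P sec",
}


def apply_pronunciations(
    text: str, mapping: dict[str, str] = PRONUNCIATIONS
) -> str:
    """Replace whole-token technical terms with TTS-friendly respellings."""
    if not mapping:
        return text
    # Longest terms first so overlapping entries resolve predictably.
    terms = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(t) for t in terms) + r")(?!\w)"
    )
    return pattern.sub(lambda m: mapping[m.group(1)], text)
```

Applying this per turn, just before synthesis, keeps the written script readable while the spoken version gets the respelled terms.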

Bottom line: this is a functional MVP/POC that now works end to end, with clear quality and polish improvements queued.
