Thomas Edison called his Menlo Park researchers "muckers." They weren't inventing through genius - they were trying hundreds of things to see what worked. Filament after filament. Failure after failure. Progress through persistent experimentation.

I grew up not far from that lab in New Jersey. The ethos resonates.

This blog is about mucking around at the frontier of AI-assisted software development. By day I'm an engineering director, and we're appropriately cautious there - operating at what I'd call Level 3, with some Level 4 experiments on internal tooling (more on the levels below). But on my own time, I'm pushing into wilder territory, trying to figure out what works before it matters professionally.

My coworker Petteri Valkonen called this "mulleting." Business in the front, party in the back. Careful at work, wild at home.

I'll be honest: until recently, I was a skeptic. Not of whether AI could write code - that was obvious - but of whether it would actually change anything. Most organizations don't trust their own people very much. Why would they trust an LLM? The bottleneck was never the technology. It was the org charts, the approval chains, the deeply held belief that quality comes from inspection and oversight.

What changed my mind was Chad Fowler's Phoenix Architecture blog series. I'd been influenced by his ideas on legacy avoidance and throwaway code for years, but the Phoenix Architecture posts crystallized something: if you've already internalized that code is cheap and should be disposable, then AI doesn't just make you faster - it makes that entire philosophy dominant. The economics shift from "write code carefully because changing it is expensive" to "write code disposably because rewriting it is nearly free."

Once I saw that, I couldn't unsee it.

I genuinely enjoy this. Tinkering with multi-agent systems at 10pm after my three-year-old is asleep scratches an itch that production code doesn't. But I'd be lying if I said there wasn't some lament mixed in. The experiments I run at night keep paying dividends at work. It'd be nice if that learning happened on the clock.

Edison got investors and a whole laboratory in Menlo Park. I get the hours between 10pm and midnight. You work with what you have.

I think most organizations aren't making space for this kind of exploration yet. That's understandable - there's real risk, real delivery pressure, real uncertainty about what's hype versus substance. But the frontier is moving fast. Leaders who don't find ways to let their engineers muck around are going to wake up one day and wonder why their competitors are shipping so much faster.

This blog is partly my attempt to make that case, one experiment at a time.

The Five Levels of AI-Assisted Development

Here's how I see the current landscape. The productivity numbers below are vibes, not science. I'm not going to pretend I've measured this rigorously. But directionally? I stand by them.

Level 1: The Intern

GitHub Copilot circa 2022. It suggests the next line, sometimes helpfully. You're still writing all the code - the AI just occasionally finishes your sentences. Marginal productivity gain. If you're still here, you're missing the real shift.

Level 2: The Senior Engineer

Now you're pair programming with a capable partner. The AI writes functions, you review them carefully. It explains code, suggests fixes, handles boilerplate. Real gains - let's call it 1.5-2x on good days. Most engineers who "use AI" live here.

Level 3: The Tech Lead

You're working like a manager of code. You write detailed stories and requirements, review plans, and keep an eye on the stream of code to ensure quality. You use AI for targeted assistance - running multiple review passes optimized for different concerns (security, readability, edge cases, gaps). The AI does research, helps debug, drafts implementations. You stay close to the code but delegate more.

This is where responsible production use lives today. Productivity is maybe 2-4x, depending on the task and your tolerance for risk.
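
To make the "multiple review passes" idea concrete, here's the shape of it in Python. This is a sketch, not my production setup - `ask_model`, the concern list, and the prompt wording are all hypothetical stand-ins for whatever client and prompts you actually use.

```python
# Level 3 sketch: review the same diff several times, each pass focused on a
# single concern, then hand the findings to a human to triage.

REVIEW_PASSES = {
    "security": "Review this diff for injection, authz, and secrets-handling issues.",
    "readability": "Review this diff for naming, structure, and clarity problems.",
    "edge_cases": "Review this diff for unhandled inputs, off-by-one errors, and failure paths.",
    "gaps": "Compare this diff against the requirements and list anything missing.",
}

def ask_model(prompt: str) -> str:
    """Hypothetical LLM call - replace with your provider's client."""
    raise NotImplementedError

def review(diff: str, requirements: str) -> dict[str, str]:
    findings = {}
    for concern, instruction in REVIEW_PASSES.items():
        prompt = f"{instruction}\n\nRequirements:\n{requirements}\n\nDiff:\n{diff}"
        findings[concern] = ask_model(prompt)
    return findings  # one focused report per concern
```

The value is in the narrowing: each pass gets one job instead of a single "review everything" prompt.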

Level 4: The CTO

You stop looking at the code.

Instead, you build verification systems - comprehensive tests, type checking, linting, acceptance criteria - and trust the artifacts, not the implementation. If the tests pass and the types check, you ship it.

This requires thinking in systems rather than code. It's how executives are already supposed to think about software, but now applied to AI output at the implementation level. The people you hear about running five Claude Code sessions in parallel? They're here. Productivity is... I don't know, 10-20x? Maybe more? You can build in a day what used to take months.
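
For flavor, here's a toy version of that gate. It assumes a pytest/mypy/ruff toolchain, which is just one possibility - the point is the shape: you (or the pipeline) read verdicts, not diffs.

```python
# Level 4 sketch: ship only if every verification artifact comes back green.
# The specific tools are assumptions; swap in whatever your stack trusts.
import subprocess
import sys

CHECKS = [
    ("tests", ["pytest", "-q"]),
    ("types", ["mypy", "."]),
    ("lint", ["ruff", "check", "."]),
]

def gate() -> bool:
    for name, cmd in CHECKS:
        if subprocess.run(cmd).returncode != 0:
            print(f"BLOCKED: {name} failed")
            return False
    print("All checks passed - ship it")
    return True

if __name__ == "__main__":
    sys.exit(0 if gate() else 1)
```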

The catch: your verification layer better be airtight. And who builds that layer?

Level 5: The Mad Scientist

The AI runs verification. You verify that its verification strategy is sound and that it's actually doing its job. You're not checking code, or even test results - you're checking that the system of checking works.
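
I don't have a recipe for this, but one flavor of checking the checker borrows from mutation testing: plant changes you know are bad and confirm the verification layer rejects every one of them. Everything below is a hypothetical sketch, not a working harness - the three helpers are stand-ins for hooks into your own pipeline.

```python
# Level 5 sketch: trust the checker only if it catches deliberately planted bugs.

KNOWN_BAD_PATCHES = [
    "patches/drop_auth_check.diff",
    "patches/off_by_one.diff",
    "patches/swallow_exception.diff",
]

def apply_patch(path: str) -> None:
    """Hypothetical: apply a deliberately broken change to a scratch checkout."""
    raise NotImplementedError

def revert_patch(path: str) -> None:
    """Hypothetical: restore the scratch checkout."""
    raise NotImplementedError

def run_verification() -> bool:
    """Hypothetical: run the AI-maintained test/type/lint gate; True means it passed."""
    raise NotImplementedError

def checker_is_trustworthy() -> bool:
    """The checking system earns trust only if it rejects every planted defect."""
    for patch in KNOWN_BAD_PATCHES:
        apply_patch(patch)
        try:
            if run_verification():   # a known-bad change got through
                return False         # so the checker itself is broken
        finally:
            revert_patch(patch)
    return True
```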

This is where multi-agent orchestration comes in. Steve Yegge's Gas Town is worth exploring if you're curious - it's an attempt to build exactly this kind of industrialized coding factory.

This is where the crazy ones are. I'm trying to get there.

But the tooling is only part of the problem. The harder challenge is trust calibration - knowing when to trust the AI's judgment about its own output.

No one's figured this out yet. Honestly, it's not even easy to do with humans managing humans.

Level 5 is genuinely unsolved. It's where I spend my hobby hours mucking around.

Why This Blog

Levels 2 and 3 aren't that hard to learn. A week of focused practice and most engineers get there. Level 4 takes more - you need new mental models, and you need to build them through repetition.

Level 5 is pure R&D. Edison territory.

Here's the thing about Edison's mucking: the first-order effects were obvious. He killed gas lighting and created General Electric. But those were rounding errors compared to what came next. Electricity didn't just replace candles - it made cinema possible, and radio, and computing, and the entire modern world. The nth-order effects dwarfed anything Edison could have predicted.

I think we might be in a similar moment. The obvious story is "AI makes developers more productive." And that's true, but it's also boring. The interesting question is: what becomes possible when creating software is 10x or 100x cheaper and faster? What new things get built that nobody's imagining yet?

I don't know. That's why I'm mucking around.

But I'll tell you what I do think is coming: a bloodbath of creative destruction.

Most organizations are not prepared for this. They're still structured around inspection - code reviews, approval chains, manual QA, managers whose job is to coordinate and check work. That made sense when changing code was expensive and risky. It makes less sense when code is disposable.

The shift is toward statistical quality controls, not inspection. Toward flattened hierarchies, because you don't need as many people to coordinate when one person can do what a team used to do. Toward processes and job descriptions and career paths that don't exist yet.
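
To put a toy example behind "statistical quality controls" - my reading of the phrase, not a standard - you stop inspecting every change, audit a random sample after the fact, and escalate when the measured defect rate drifts. The numbers here are made up.

```python
# Sketch of sampling-based quality control instead of per-change inspection.
import random

SAMPLE_RATE = 0.10          # audit roughly 10% of merged changes
DEFECT_RATE_ALARM = 0.05    # escalate if more than 5% of audited changes were defective

def select_for_audit(merged_change_ids: list[str]) -> list[str]:
    """Randomly sample merged changes for after-the-fact human review."""
    if not merged_change_ids:
        return []
    k = max(1, round(len(merged_change_ids) * SAMPLE_RATE))
    return random.sample(merged_change_ids, k)

def needs_escalation(audited: int, defective: int) -> bool:
    """Raise a flag when the sampled defect rate crosses the alarm threshold."""
    return audited > 0 and (defective / audited) > DEFECT_RATE_ALARM
```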

People worry that engineers will face massive unemployment. I'm not so sure. I think it's possible the real displacement hits middle management - the coordinators, the process people, the ones whose value was making sure the work got done correctly. When the work gets faster and verification gets automated, what exactly are they doing?

This isn't a prediction. It's a question I'm asking out loud.

If you're an engineering leader: the organizations that figure out how to operate at Levels 4 and 5 - and how to let their people experiment toward them - are going to have a serious advantage. Not just because they'll ship faster, but because they'll stumble into possibilities that the cautious ones will never see.

Making space for that experimentation is hard. I get it. But it's worth finding a way.

In the meantime, I'll be here after bedtime, mucking around.