A deployment pipeline—“run tests, check coverage, build container, deploy to staging, verify health”—is a knowledge item. The AI discovers it, executes it, adapts it. If Step 3 fails, the AI knows how to handle it because it has context from prior failures. A hardcoded orchestration framework would call a failure handler. The AI improvises—because it has the knowledge to improvise.
The most powerful multistep execution isn’t a chain of specialized agents. It’s a single model with enough context to plan, execute, and recover—informed by everything it’s learned from every prior execution.
The Simplicity Test
Here’s my test for any AI development tool: Can new engineers start using it by typing what they want in plain English?
If they need to learn slash commands, configure agents, set up skills, or understand the orchestration model, the tool is pushing its complexity onto the user. That complexity exists because the tool doesn’t know enough to handle the request naturally.
One model that remembers beats 47 that don’t.