Monday, 12:00. Grace, the CEO of ACME Corp, just finished her Q2 leadership meeting. The team decided it is time to build an integration with a major platform. An AI agent picks up the meeting summary and creates the first tickets. Grace's VP of R&D approves them over Slack and kicks off development.
There is nothing in this scenario that cannot be done today. This is the new baseline for a healthy SDLC. So how do you get there?
01 / The New Age of Software
From human-as-worker to human-as-decider.
By now it is clear the nature of software engineering is undergoing a fundamental change. For decades, the industry focused on making humans more efficient at doing their job. We are now entering an era where the primary "workers" will be autonomous agents, and the primary role of the human is to make decisions.
This is not speculative. Top-tier engineering organizations are already retooling for this. Stripe is deploying Minions for one-shot end-to-end coding tasks. Ramp built background agents to handle the repetitive toil that usually slows teams down. Companies like Cursor are moving toward cloud agents, and the broader industry is pushing toward background agents. In that model, the models sit upstream, constantly proposing changes, and our pipelines sit downstream, acting as high-fidelity filters for quality.
The data supports this. A METR study from early 2025 found that the length of tasks AI agents can complete has been doubling roughly every seven months. Industry leaders like Andrej Karpathy and Michael Truell point to a future where programming is less about syntax and more about managing these agents. As Boris Cherny put it, the shift is already here.
The software development lifecycle has always been a series of handoffs. A ticket becomes a plan, a plan becomes a design, a design becomes code, code becomes a release. At each step, context is transferred from one person or team to the next, and some of it is inevitably lost. AI agents are about to change the mechanics of these handoffs. Not by replacing the people involved, but by carrying context forward more reliably, doing the repetitive parts faster, and freeing humans to focus on the decisions that actually require judgment.
We are still in the very early days and engineers are ahead of the rest of the business, but this is about to change. This article proposes a framework for thinking about where agents fit, what triggers them, what context they need, and critically, where humans stay in the loop. It is opinionated where it should be (humans in the loop, handoffs, feedback loops) and flexible where your organization needs it to be (choose your own tools).
Figure: One pipeline. Six phases. Human gates where judgment lives. Vertical flow is forward progression. The horizontal markers show feedback loops where work can return upstream.
02 / Feedback Loops
Work does not always move forward.
The pipeline above reads top to bottom, but real work does not always move forward. Feedback loops are how the system self-corrects. Some loops are short and frequent, happening many times within a single ticket. Others are long, spanning the entire lifecycle when monitoring discovers something that needs new work.
Short loops, within a ticket
These happen constantly and are often fully automated. They are the inner engine of quality.
Long loops, across the lifecycle
These close the full cycle. They are the reason the SDLC is a loop and not a line.
The key design decision is which feedback loops agents should handle autonomously and which require human judgment. A good default: short loops within Build–Validate (test failures, lint fixes) can be automated. Everything else should notify a human and wait for a decision.
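That default can be written down as a small routing rule. The sketch below is illustrative only: the event fields, the loop kinds, and the retry cap (`MAX_AUTO_ATTEMPTS`) are assumptions added here so an automated loop cannot spin forever, not part of any particular tool.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    AUTO_RETRY = "auto_retry"      # agent fixes the issue and re-runs on its own
    NOTIFY_HUMAN = "notify_human"  # pause and wait for a human decision


# Hypothetical names: the loops considered safe to automate inside Build-Validate.
AUTOMATABLE_PHASES = {"build", "validate"}
AUTOMATABLE_KINDS = {"test_failure", "lint_error"}
MAX_AUTO_ATTEMPTS = 3  # assumed cap, not from the article


@dataclass(frozen=True)
class FeedbackEvent:
    phase: str    # phase that raised the feedback, e.g. "validate"
    kind: str     # e.g. "test_failure", "security_finding"
    attempt: int  # automated attempts already made for this issue


def route_feedback(event: FeedbackEvent) -> Action:
    """Automate only short Build-Validate loops that are still under the cap."""
    is_short_loop = (
        event.phase in AUTOMATABLE_PHASES and event.kind in AUTOMATABLE_KINDS
    )
    if is_short_loop and event.attempt < MAX_AUTO_ATTEMPTS:
        return Action.AUTO_RETRY
    return Action.NOTIFY_HUMAN
```

Anything not explicitly listed as automatable falls through to a human, which keeps the default conservative as new loop types appear.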
Figure: Context accumulation across phases. Each phase inherits everything the last one made. New artifacts are produced at each phase. By the time you reach Release, the agent needs access to everything upstream.
03 / Context Accumulates
The hardest unglamorous problem.
One of the hardest problems in the agentic SDLC is context management. Each phase produces artifacts that downstream phases need. By the time you reach Release, an agent needs access to the ticket, the plan, the design artifacts, the code changes, the test results, and the compliance evidence.
The practical approaches to managing this context range from simple to complex. Most teams should start at the simple end.
Simple. Agents access tools directly (Jira API, Google Docs, Git) and pull context on demand. A summary of prior phases is passed as part of the agent's initial prompt. This works for most teams and keeps infrastructure minimal.
Advanced. A shared state store (vector DB, knowledge graph) where each phase deposits structured artifacts. Agents query semantically for relevant context rather than receiving everything. This becomes necessary when the volume of artifacts exceeds what fits in an agent's context window.
Start with tool access. Let agents pull what they need from the systems that already hold the information. You can always add a shared state layer later when context volume forces the issue. You almost never need to add it upfront.
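Here is a minimal sketch of the simple approach, assuming a phase order inferred from this article and tools modeled as plain callables. The function names (`build_initial_prompt`, `pull_context`) and the prompt layout are hypothetical; a real setup would wrap your actual ticket, document, and Git clients.

```python
from typing import Callable, Dict, List

# Phase order inferred from the article; adjust to your own pipeline.
PHASES: List[str] = ["plan", "design", "build", "validate", "release", "monitor"]

# A tool is any callable that takes a reference and returns text.
Tool = Callable[[str], str]


def build_initial_prompt(phase: str, task: str, summaries: Dict[str, str]) -> str:
    """Compose the starting prompt from summaries of upstream phases only."""
    upstream = PHASES[: PHASES.index(phase)]
    lines = [f"Phase: {phase}", f"Task: {task}", "Upstream context:"]
    for name in upstream:
        if name in summaries:
            lines.append(f"- {name}: {summaries[name]}")
    return "\n".join(lines)


def pull_context(tools: Dict[str, Tool], source: str, ref: str) -> str:
    """Fetch one artifact on demand from a system the agent was granted."""
    if source not in tools:
        raise KeyError(f"agent has no access to {source!r}")
    return tools[source](ref)
```

The prompt carries only short summaries. Full artifacts stay in the systems that own them and are fetched when the agent asks, which is what keeps this approach cheap to run.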
04 / Orchestration
Keep it boring.
There is a strong temptation to over-engineer orchestration. Multi-agent frameworks, complex message buses, dynamic agent routing. As with everything in software development, the teams that succeed start simple.
The simplest orchestration pattern that works for the SDLC is a sequential pipeline with human checkpoints, backed by your existing ticket system as the state machine. The ticket's status field is the orchestration. When a ticket moves to "Ready for Design," the design agent activates. When a PR is opened, the validation agents run. When validation passes and a human approves, the release pipeline fires.
Use your ticket system as the orchestrator. Ticket status transitions are your events. CI/CD pipelines are your agent runners. Git branches provide isolation for parallel agent work. Message queues (SQS, Kafka, and so on) are for when you outgrow this, not for when you start.
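To make "the ticket's status field is the orchestration" concrete, here is a rough sketch. Only "Ready for Design" comes from the text above; the other status names, the agent names, and the `Ticket` fields are placeholders for whatever your ticket system exposes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ticket:
    id: str
    status: str
    validation_passed: bool = False
    human_approved: bool = False


def next_agent(ticket: Ticket) -> Optional[str]:
    """Map a ticket's current status to the agent or pipeline that should run."""
    if ticket.status == "Ready for Design":
        return "design_agent"
    if ticket.status == "PR Opened":  # hypothetical status name
        return "validation_agents"
    if ticket.status == "Validated":  # hypothetical status name
        # Release needs both a passing validation and a human approval.
        if ticket.validation_passed and ticket.human_approved:
            return "release_pipeline"
    return None  # nothing to trigger; the ticket waits for its next transition
```

A webhook or CI job would call `next_agent` on each status change and start the named runner. There is no scheduler beyond the ticket system itself.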
When you do need more sophisticated orchestration, a simple message queue gives you everything: decoupled phases, retry logic, dead letter queues for failed agent runs, and the ability to scale agents independently. But most organizations are not there yet, and pretending you are will cost you more in complexity than it saves in efficiency.
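For a feel of what retry logic and a dead letter queue buy you, here is an in-memory sketch using only the standard library. It stands in for a real broker such as SQS or Kafka and does not use their APIs. The `Job` shape and the `MAX_ATTEMPTS` value are assumptions.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List

MAX_ATTEMPTS = 3  # assumed retry budget per agent run


@dataclass
class Job:
    phase: str
    ticket_id: str
    attempts: int = 0


def drain(queue: Deque[Job], handler: Callable[[Job], None]) -> List[Job]:
    """Run every job; requeue failures until the budget is spent, then dead-letter."""
    dead_letters: List[Job] = []
    while queue:
        job = queue.popleft()
        job.attempts += 1
        try:
            handler(job)
        except Exception:
            if job.attempts >= MAX_ATTEMPTS:
                dead_letters.append(job)  # park it for a human to inspect
            else:
                queue.append(job)         # retry after the other pending jobs
    return dead_letters
```

A managed queue gives you the same behavior with durability and backoff included, which is the point at which it starts to earn its complexity.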
05 / Org Design
Your structure already works. Don't break it.
There is a recurring pattern in how companies approach AI adoption. A leadership directive arrives: we need to be AI-first. An AI enablement team is formed, or an existing platform team is given the mandate. Within weeks, that team is drowning. Every department wants something. Product wants AI-generated specs. Engineering wants coding agents. QA wants automated test generation. Ops wants intelligent alerting. The enablement team becomes a bottleneck, trying to build bespoke AI solutions for every team in the company while everyone waits.
This is the wrong model. It fails for the same reason a platform team fails when it tries to build every internal tool itself instead of providing the platform that lets other teams build what they need.
The agentic SDLC framework does not require a new organizational structure. It requires the existing one to work the way it already should. Product people still own planning. Designers and architects still own design. Engineers still own building and validating. SREs still own releases and monitoring. What changes is not who does the work, but what tools they have.
The AI enablement team (or platform team, or developer experience team) should not be building agents for every phase of the SDLC. Their job is to provide the infrastructure: the LLM access layer, the cost controls, the observability tooling, the secure execution environments, and the guardrails. Then each domain team configures and steers the agents for their own phase, because they are the ones with the domain knowledge to do it well.
A product manager knows what makes a good PRD for their domain. A senior engineer knows which patterns belong in their codebase and which do not. An SRE knows what healthy looks like for their services. No central AI team can replicate that knowledge. What the central team can do is make it easy for those experts to plug their knowledge into agent workflows: prompt templates, system instructions, tool access, and evaluation criteria.
This is the same model that made platform engineering successful. The platform team does not write your CI pipelines. They give you the CI system, the runners, the security scanning integration, and the deployment targets. Your team writes the pipeline that fits your service. The agentic SDLC works the same way. The enablement team provides the agent infrastructure. Your team provides the expertise that makes the agents useful.
The organizational benefit is that domain knowledge stays where it belongs: distributed across the people who actually do the work. The community of practice between product managers, the design review culture, the engineering guild that maintains coding standards, the on-call rotation that knows production inside out. None of that needs to be replaced. All of it becomes the guidance layer that makes agents effective rather than dangerous.
06 / The Manual Parts
The humans are the guidance system, not the bottleneck.
The most counterintuitive insight about the agentic SDLC is this: the manual parts are not the weakness. They are the strength. Human approval gates exist because judgment, accountability, and compliance cannot be automated away.
Agents work well with good guidance. They work poorly without it. The humans in the loop are not bottlenecks to be optimized away. They are the guidance system. A plan that no product person has read will produce the wrong feature. Code that no engineer has reviewed will accumulate debt. A deployment that no SRE has approved will eventually take down production.
As agents produce more output faster, the risk shifts from "not enough automation" to "too many approvals rubber-stamped." If your humans are approving 50 agent-generated pull requests a day without reading them, you do not have an agentic SDLC. You have an automated one with a decorative human in the loop. Design your approval workflows so that the number of decisions stays manageable and each one gets real attention.
07 / Observability
You cannot manage what you cannot see.
When agents do the work, humans need to see what happened. This is not optional. Observability in the agentic SDLC covers three layers.
Orchestration. Which agents ran, in what order, with what inputs, and what they produced. This is your audit trail.
Operation. How each agent behaved: token usage, latency, tool calls, error rates, retries. This is your cost and reliability signal.
Output. The quality of what agents produced: were the tests meaningful, was the code reviewable, did the PRD capture the intent? This requires human evaluation, at least for now.
The operational layer is especially important for cost control. Multi-agent setups can consume 4 to 15 times more compute than single-turn workflows. Without per-agent cost tracking, teams discover they have a problem only when the invoice arrives.
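Per-agent cost tracking does not need much to get started. The sketch below aggregates run records by agent and flags any agent over a budget. The record fields, the function names, and the per-token prices are placeholders, so substitute your provider's actual rates.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List

# Placeholder prices in USD per 1,000 tokens; not real provider rates.
PRICE_PER_1K_INPUT = 0.003
PRICE_PER_1K_OUTPUT = 0.015


@dataclass(frozen=True)
class AgentRun:
    agent: str
    input_tokens: int
    output_tokens: int
    succeeded: bool


def run_cost_usd(run: AgentRun) -> float:
    """Cost of a single run at the placeholder prices above."""
    return (
        run.input_tokens / 1000 * PRICE_PER_1K_INPUT
        + run.output_tokens / 1000 * PRICE_PER_1K_OUTPUT
    )


def summarize(runs: Iterable[AgentRun]) -> Dict[str, Dict[str, float]]:
    """Total runs, failures, and cost for each agent."""
    totals: Dict[str, Dict[str, float]] = defaultdict(
        lambda: {"runs": 0, "failures": 0, "cost_usd": 0.0}
    )
    for run in runs:
        entry = totals[run.agent]
        entry["runs"] += 1
        entry["failures"] += 0 if run.succeeded else 1
        entry["cost_usd"] += run_cost_usd(run)
    return dict(totals)


def over_budget(summary: Dict[str, Dict[str, float]], budget_usd: float) -> List[str]:
    """Names of agents whose total cost exceeds the budget, sorted for stable output."""
    return sorted(a for a, s in summary.items() if s["cost_usd"] > budget_usd)
```

Run this daily against your agent logs and the invoice stops being the first place you learn about a runaway agent.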
08 / Security
Agents need sandboxes too.
Every agent that writes code, runs tests, or deploys changes needs a secure execution environment. The security model should follow the principle of least privilege: agents start with read-only access and earn write permissions through demonstrated reliability and appropriate human oversight.
A planning agent that reads tickets and writes documents is low risk. A build agent that pushes to Git is medium risk. A release agent that deploys to production is high risk. Each level demands different isolation, different approval flows, and different monitoring. This is a deep topic that deserves its own treatment (see my colleague Emir's writeup), but the key principle is: treat agent permissions like you treat employee permissions. Scope them tightly and audit them continuously.
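The three risk levels map naturally onto a deny-by-default policy table. This is a sketch under assumptions: the agent names, the action strings, and the rule that high-risk actions always need a human approval are illustrative choices, not a prescribed model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet, List, Tuple


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class AgentPolicy:
    risk: Risk
    allowed_actions: FrozenSet[str]


# Hypothetical agents and action names, scoped as tightly as each role allows.
POLICIES: Dict[str, AgentPolicy] = {
    "planning_agent": AgentPolicy(Risk.LOW, frozenset({"tickets:read", "docs:write"})),
    "build_agent": AgentPolicy(Risk.MEDIUM, frozenset({"repo:read", "repo:push_branch"})),
    "release_agent": AgentPolicy(Risk.HIGH, frozenset({"deploy:production"})),
}

# Every decision is recorded so permissions can be audited continuously.
AUDIT_LOG: List[Tuple[str, str, bool]] = []


def authorize(agent: str, action: str, human_approved: bool = False) -> bool:
    """Deny by default; high-risk actions additionally require human approval."""
    policy = POLICIES.get(agent)
    allowed = policy is not None and action in policy.allowed_actions
    if allowed and policy.risk is Risk.HIGH:
        allowed = human_approved
    AUDIT_LOG.append((agent, action, allowed))
    return allowed
```

Unknown agents and unlisted actions are refused, and refusals are logged alongside grants, so the audit trail shows what agents tried to do as well as what they did.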
09 / Where To Start
One phase, measured, then the next.
You do not need to build all six phases at once. Start where the technology is most mature and the organizational risk is lowest. Most teams that succeed follow a sequence like this.
Build and Validate first
Code generation and automated testing are the most mature agent capabilities. Start here. Measure cycle time and defect rates. This is also where the validation harness lives, and where you will get the largest single quality jump.
Then Plan
Structured spec generation from tickets. This improves the quality of everything downstream. A good PRD agent quietly raises the floor on every later phase.
Then Monitor
Anomaly detection and incident triage. These agents work best with historical data, so give them time to learn your system before relying on them.
Then Design and Release
These require the most organizational trust and the deepest system context. Earn that trust with the earlier phases first.
10 / The Work Ahead
Faster stages. Tighter feedback. Same people.
The agentic SDLC is not a product you buy or a switch you flip. It is a way of thinking about where AI fits into the work your organization already does. The phases are the same ones you have always had. The people are the same ones who have always done the work. What changes is the tooling between them: agents that carry context forward, automate the repetitive, and surface what needs human attention.
The organizations that will get this right are the ones that resist two temptations. The first is to automate everything and remove humans from the loop. The second is to centralize everything and build a single AI team that becomes a bottleneck for the entire company. The better path is to distribute the capability, keep the expertise where it lives, and invest in the infrastructure that makes agents safe, observable, and accountable.
Start with one phase. Measure what changes. Expand when you have evidence, not enthusiasm. The SDLC has survived every technology shift for decades because its structure reflects how software actually gets built: in stages, with feedback, by people who care about the outcome. Agents do not change that. They just make the stages faster and the feedback tighter.
The people still matter most. Build accordingly.