Monday, 12:00. Grace, the CEO of ACME Corp, just finished her Q2 leadership meeting. The team decided it's time to build an integration with a major platform. An AI agent picks up the meeting summary and creates the first tickets. Grace's VP of R&D approves them over Slack and kicks off development.
At 14:00, Alan, the PM, gets notified about the ticket he's been assigned. The planning agent has already picked up the details and is offering a few courses of action. Alan picks one, nudges the PRD agent in the right direction, and an hour later a PRD is ready and new tickets are created for the team.
At 16:00, Ada, the team's architect, is deep in discussions about the right system design. It will require extended thinking and several sub-agents. It's a good thing those agents work overnight.
Tuesday, 10:00. Ada has had her coffee and it's decision time. Designs are ready, and it's time for the engineers to pick them up.
Tuesday, 11:00. John is a backend engineer and he's picking up the new integration tickets. The coding agent codes, the testing agent writes the tests, and John points out a few issues. Around 14:00 the pull request is ready to be reviewed. A review agent prepares the review and John's team points out some missed edge cases. By the end of the day the PR is ready to be merged.
Wednesday, 10:00. The new integration is ready to be tested in the staging environment. At 14:00, the testing agents find some issues and the team fixes those. By Thursday morning it's time to release.
Thursday, 10:00. The team is working with canary releases, so the release agent verifies that key KPIs are solid before full production releases. Other tickets will go through the experimentation agents, but not this one.
17:00. It's been a full day, and the team gets notified the change is stable. Time for a full production release.
On Friday morning, the team fully releases the feature. For the next couple of weeks the monitoring agents will pay attention to business and technical metrics. If something comes up, new tickets will be opened, and the full cycle will repeat.
There's nothing in this scenario that cannot be done today. This is the new baseline for a healthy SDLC. So how do you get there?
The new age of software
By now it's clear the nature of software engineering is undergoing a fundamental change. For decades, the industry focused on making humans more efficient at doing their job. We are now entering an era where the primary "workers" will be autonomous agents, and the primary role of the human is to make decisions.
This is not speculative. Top-tier engineering organizations are already retooling for this. Stripe is deploying Minions for one-shot end-to-end coding tasks. Ramp built background agents to handle the repetitive toil that usually slows teams down. We see companies like Cursor moving toward cloud agents, and a broader industry push toward background agents where models are "upstream," constantly proposing changes, and our pipelines are "downstream," acting as the high-fidelity filters for quality.
The data supports this: a METR study from early 2025 shows AI agents already performing at the level of experienced open-source developers on complex tasks. Industry leaders like Andrej Karpathy and Michael Truell point to a future where programming is less about syntax and more about managing these agents. As Boris Cherny put it, the shift is already here.
The software development lifecycle has always been a series of handoffs. A ticket becomes a plan, a plan becomes a design, a design becomes code, code becomes a release. At each step, context is transferred from one person or team to the next, and some of it is inevitably lost. AI agents are about to change the mechanics of these handoffs. Not by replacing the people involved, but by carrying context forward more reliably, doing the repetitive parts faster, and freeing humans to focus on the decisions that actually require judgment.
We're still in the very early days, and engineers are ahead of the rest of the business, but this is about to change. This article proposes a framework for thinking about where agents fit, what triggers them, what context they need, and, critically, where human interactions happen. A warning: it is opinionated where it should be (think humans in the loop, handoffs, and feedback loops) and flexible where your organization needs it to be (choose your own tools!).
The six phases
The SDLC is not new. What changes with agents is how work flows between phases. Each phase has one primary agentic workflow, triggered by events and gated by human approvals where compliance or judgment demands it.
Each phase has its own agent workflow. The phases flow top to bottom, showing forward progression; feedback loops mark the points where work can return upstream.
Feedback loops
The pipeline above reads top to bottom, but real work does not always move forward. Feedback loops are how the system self-corrects. Some loops are short and frequent, happening many times within a single ticket. Others are long, spanning the entire lifecycle when monitoring discovers something that needs new work.
Short loops (within a ticket)
These happen constantly and are often fully automated. They are the inner engine of quality.
Long loops (across the lifecycle)
These close the full cycle. They are the reason the SDLC is a loop and not a line.
The key design decision is: which feedback loops should agents handle autonomously, and which require human judgment? A good default: short loops within Build-Validate (test failures, lint fixes) can be automated. Everything else should notify a human and wait for a decision.
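That default can be made explicit as a small routing policy. The sketch below is illustrative: the phase names and the allow-list are hypothetical, and in practice you would load this from configuration rather than hard-code it.

```python
# Hypothetical policy: short loops within Build-Validate are handled
# autonomously; every other feedback loop pauses for a human decision.

AUTONOMOUS_LOOPS = {
    ("validate", "build"),  # test failure -> agent fixes and re-runs
    ("build", "build"),     # lint/format fixes within the same phase
}

def route_feedback(from_phase: str, to_phase: str) -> str:
    """Return 'auto' if an agent may act on this loop, else 'human'."""
    if (from_phase, to_phase) in AUTONOMOUS_LOOPS:
        return "auto"
    return "human"  # notify a person and wait for a decision

print(route_feedback("validate", "build"))  # auto
print(route_feedback("monitor", "plan"))    # human
```

The important property is that the policy is a deny-by-default allow-list: a loop is only automated if someone deliberately added it.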
Context accumulates
One of the hardest problems in the agentic SDLC is context management. Each phase produces artifacts that downstream phases need. By the time you reach Release, an agent needs access to the ticket, the plan, the design artifacts, the code changes, the test results, and the compliance evidence.
Context accumulation across phases
The practical approaches to managing this context range from simple to complex. Most teams should start at the simple end:
Simple: Direct tool access. Agents query the systems that already hold the information, such as the ticket tracker, the design docs, and the repository, and pull only what the current task needs.
Advanced: A shared state store (vector DB, knowledge graph) where each phase deposits structured artifacts. Agents query semantically for relevant context rather than receiving everything. This becomes necessary when the volume of artifacts exceeds what fits in an agent's context window.
Start with tool access. Let agents pull what they need from the systems that already hold the information. You can always add a shared state layer later when context volume forces the issue. You almost never need to add it upfront.
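The tool-access approach can be sketched as a per-phase list of fetchers. Everything here is a hypothetical stand-in: the fetch functions, the phase names, and the ticket ID are illustrative, with real systems of record behind them in practice.

```python
# Sketch: each phase declares which upstream artifacts its agent needs,
# and context is pulled on demand instead of kept in a shared store.

def fetch_ticket(ticket_id: str) -> dict:
    return {"id": ticket_id, "title": "Build partner integration"}

def fetch_prd(ticket_id: str) -> dict:
    return {"ticket": ticket_id, "doc": "PRD v1"}

def fetch_design(ticket_id: str) -> dict:
    return {"ticket": ticket_id, "doc": "Design v1"}

PHASE_SOURCES = {
    "design": [fetch_ticket, fetch_prd],
    "build":  [fetch_ticket, fetch_prd, fetch_design],
}

def assemble_context(phase: str, ticket_id: str) -> list[dict]:
    """Pull only the artifacts this phase's agent needs, on demand."""
    return [fetch(ticket_id) for fetch in PHASE_SOURCES[phase]]

print(len(assemble_context("build", "ACME-123")))  # 3
```

Note how context accumulates naturally: the build phase's list is a superset of the design phase's, without any shared state layer.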
Orchestration: keep it boring
There is a strong temptation to over-engineer orchestration: multi-agent frameworks, complex message buses, dynamic agent routing. As with everything in software development, the teams that succeed start simple.
The simplest orchestration pattern that works for the SDLC is a sequential pipeline with human checkpoints, backed by your existing ticket system as the state machine. The ticket's status field is the orchestration. When a ticket moves to "Ready for Design," the design agent activates. When a PR is opened, the validation agents run. When validation passes and a human approves, the release pipeline fires.
When you do need more sophisticated orchestration, a simple message queue gives you everything: decoupled phases, retry logic, dead letter queues for failed agent runs, and the ability to scale agents independently. But most organizations are not there yet, and pretending you are will cost you more in complexity than it saves in efficiency.
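When you do reach that point, the core mechanics are small. This is an in-memory sketch standing in for a real broker (SQS, RabbitMQ, and similar systems provide retries and dead-letter queues natively); the retry count and agent are illustrative.

```python
# Sketch: retry a failed agent run a few times, then move the task to a
# dead-letter queue where a human can inspect it.

from collections import deque

MAX_RETRIES = 3

def run_with_retries(task: dict, agent, dead_letter: deque):
    """Run an agent on a task; after MAX_RETRIES failures, dead-letter it."""
    last_error = None
    for _ in range(MAX_RETRIES):
        try:
            return agent(task)
        except Exception as err:
            last_error = err  # retry on the next iteration
    dead_letter.append((task, str(last_error)))  # a human picks this up
    return None

dlq: deque = deque()

def flaky_agent(task):
    raise RuntimeError("model timeout")  # always fails, for illustration

run_with_retries({"ticket": "ACME-123"}, flaky_agent, dlq)
print(len(dlq))  # 1 dead-lettered task awaiting human attention
```

The dead-letter queue is the important part: a failed agent run should become a visible artifact for a person, not a silent retry loop.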
Your org already has the right structure
There is a recurring pattern in how companies approach AI adoption. A leadership directive arrives: we need to be AI-first. An AI enablement team is formed, or an existing platform team is given the mandate. Within weeks, that team is drowning. Every department wants something. Product wants AI-generated specs. Engineering wants coding agents. QA wants automated test generation. Ops wants intelligent alerting. The enablement team becomes a bottleneck, trying to build bespoke AI solutions for every team in the company while everyone waits.
This is the wrong model. It fails for the same reason it fails when a platform team tries to build every internal tool themselves instead of providing the platform that lets other teams build what they need.
The agentic SDLC framework does not require a new organizational structure. It requires the existing one to work the way it already should. Product people still own planning. Designers and architects still own design. Engineers still own building and validating. SREs still own releases and monitoring. What changes is not who does the work, but what tools they have.
A product manager knows what makes a good PRD for their domain. A senior engineer knows what patterns belong in their codebase and what does not. An SRE knows what healthy looks like for their services. No central AI team can replicate that knowledge. What the central team can do is make it easy for those experts to plug their knowledge into agent workflows: prompt templates, system instructions, tool access, and evaluation criteria.
This is the same model that made platform engineering successful. The platform team does not write your CI pipelines. They give you the CI system, the runners, the security scanning integration, and the deployment targets. Your team writes the pipeline that fits your service. The agentic SDLC works the same way. The enablement team provides the agent infrastructure. Your team provides the expertise that makes the agents useful.
The organizational benefit is that domain knowledge stays where it belongs: distributed across the people who actually do the work. The community of practice between product managers, the design review culture, the engineering guild that maintains coding standards, the on-call rotation that knows production inside out. None of that needs to be replaced. All of it becomes the guidance layer that makes agents effective rather than dangerous.
The manual parts matter most
The most counterintuitive insight about the agentic SDLC is this: the manual parts are not the weakness. They are the strength. Human approval gates exist because judgment, accountability, and compliance cannot be automated away.
Agents work well with good guidance. They work poorly without it. The humans in the loop are not bottlenecks to be optimized away. They are the guidance system. A plan that no product person has read will produce the wrong feature. Code that no engineer has reviewed will accumulate debt. A deployment that no SRE has approved will eventually take down production.
Observability: you cannot manage what you cannot see
When agents do the work, humans need to see what happened. This is not optional. Observability in the agentic SDLC covers two layers:
Operation: how each agent behaved (token usage, latency, tool calls, error rates, retries). This is your cost and reliability signal.
Output: the quality of what agents produced. Were the tests meaningful, was the code reviewable, did the PRD capture the intent? This requires human evaluation, at least for now.
The operational layer is especially important for cost control. Multi-agent setups can consume 4 to 15 times more compute than single-turn workflows. Without per-agent cost tracking, teams discover they have a problem only when the invoice arrives.
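Per-agent cost tracking needs little more than a ledger keyed by agent name, assuming you can read token counts from each run. The price per 1K tokens below is a placeholder, not a real rate.

```python
# Sketch of a per-agent cost ledger: record token usage per run, report
# cost per agent so the invoice is never a surprise.

from collections import defaultdict

PRICE_PER_1K_TOKENS = 0.01  # placeholder rate, not a real price

class CostLedger:
    def __init__(self) -> None:
        self.tokens: defaultdict[str, int] = defaultdict(int)

    def record(self, agent: str, tokens_used: int) -> None:
        self.tokens[agent] += tokens_used

    def cost(self, agent: str) -> float:
        return self.tokens[agent] / 1000 * PRICE_PER_1K_TOKENS

ledger = CostLedger()
ledger.record("coding_agent", 120_000)
ledger.record("coding_agent", 80_000)
ledger.record("testing_agent", 50_000)
print(f"coding_agent: ${ledger.cost('coding_agent'):.2f}")  # $2.00
```

With the 4x to 15x multiplier of multi-agent setups, this is the number that tells you which agent is burning the budget.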
Security: agents need sandboxes too
Every agent that writes code, runs tests, or deploys changes needs a secure execution environment. The security model should follow the principle of least privilege: agents start with read-only access and earn write permissions through demonstrated reliability and appropriate human oversight.
A planning agent that reads tickets and writes documents is low risk. A build agent that pushes to Git is medium risk. A release agent that deploys to production is high risk. Each level demands different isolation, different approval flows, and different monitoring. This is a deep topic that deserves its own treatment (check out my colleague Emir's writeup on the topic), but the key principle is: treat agent permissions like you treat employee permissions. Scope them tightly and audit them continuously.
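The "scope tightly, audit continuously" principle can be sketched as an explicit allow-list per agent, with every authorization decision logged. The scopes and action names are illustrative, not from any particular IAM system.

```python
# Sketch of least-privilege agent permissions: deny by default, allow only
# what is explicitly scoped, and audit every decision.

AGENT_SCOPES = {
    "planning_agent": {"read:tickets", "write:docs"},
    "build_agent":    {"read:repo", "write:branch"},   # not write:main
    "release_agent":  {"read:repo", "deploy:canary"},  # prod needs approval
}

AUDIT_LOG: list[tuple[str, str, bool]] = []

def authorize(agent: str, action: str) -> bool:
    """Allow the action only if it is in the agent's scope; log everything."""
    allowed = action in AGENT_SCOPES.get(agent, set())
    AUDIT_LOG.append((agent, action, allowed))
    return allowed

print(authorize("planning_agent", "write:docs"))   # True
print(authorize("planning_agent", "deploy:prod"))  # False
```

An agent "earning" write permissions is then a reviewed change to its scope set, which gives you a diffable, auditable history of privilege.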
Where to start
The agentic SDLC is not a product you buy or a switch you flip. It is a way of thinking about where AI fits into the work your organization already does. The phases are the same ones you have always had. The people are the same ones who have always done the work. What changes is the tooling between them: agents that carry context forward, automate the repetitive, and surface what needs human attention.
The organizations that will get this right are the ones that resist two temptations. The first is to automate everything and remove humans from the loop. The second is to centralize everything and build a single AI team that becomes a bottleneck for the entire company. The better path is to distribute the capability, keep the expertise where it lives, and invest in the infrastructure that makes agents safe, observable, and accountable.
Start with one phase. Measure what changes. Expand when you have evidence, not enthusiasm. The SDLC has survived every technology shift for decades because its structure reflects how software actually gets built: in stages, with feedback, by people who care about the outcome. Agents do not change that. They just make the stages faster and the feedback tighter.
The people still matter most. Build accordingly.