We had an agent generating Lambda handlers for an order processing service. Decent code, honestly better than I expected. The problem was we couldn’t validate any of it locally because the handler was making direct DynamoDB calls with no abstraction layer. Every test cycle meant a real deployment. The feedback loop was somewhere between 8 and 15 minutes depending on CloudFormation’s mood that day. The agent kept iterating on top of unvalidated code. By the time we caught the first real error, there were four more layered on top of it.
That’s the actual shape of the problem with agentic AI development on AWS right now. It’s not that the agents write bad code. Sometimes they do, but that’s not the bottleneck. The bottleneck is that most cloud architectures were designed around human patience, and humans will tolerate a 10-minute deploy cycle because they go get coffee. Agents don’t get coffee. They just keep generating.
Closing the Feedback Loop: Local Emulation Before Cloud
The fix we kept coming back to was local emulation as the primary feedback path. AWS SAM CLI with sam local start-api lets you run Lambda and API Gateway locally, so the agent can generate a handler and actually invoke it in seconds rather than minutes. Pair that with DynamoDB Local for any CRUD validation and you’ve removed the two most common reasons an agent has to wait on real infrastructure. For ETL work, AWS Glue publishes Docker images that let you run PySpark jobs locally against sample data. We haven’t used this as much, but it’s the same idea. The agent gets a real response, not a timeout.
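Here is roughly how that wiring can look, as a minimal sketch rather than our exact setup. The idea is to keep the DynamoDB endpoint choice in one helper so the same handler code talks to DynamoDB Local during agent iterations and to the real service once deployed. The environment variable name DYNAMODB_ENDPOINT_URL and both function names are made up for illustration; port 8000 is DynamoDB Local's default.

```python
import os


def dynamodb_client_kwargs(env=None):
    """Build keyword arguments for boto3.client("dynamodb", ...).

    DYNAMODB_ENDPOINT_URL is a hypothetical variable name chosen for this
    sketch. When it is set (for example to http://localhost:8000), the
    client is pointed at the local emulator instead of AWS.
    """
    env = os.environ if env is None else env
    kwargs = {"region_name": env.get("AWS_REGION", "us-east-1")}
    endpoint = env.get("DYNAMODB_ENDPOINT_URL")
    if endpoint:
        kwargs["endpoint_url"] = endpoint
        # Placeholder credentials for the emulator only; boto3 still needs
        # something to sign requests with. Never used against real AWS.
        kwargs["aws_access_key_id"] = "local"
        kwargs["aws_secret_access_key"] = "local"
    return kwargs


def make_dynamodb_client(env=None):
    # Imported lazily so the helper above stays usable without boto3 installed.
    import boto3

    return boto3.client("dynamodb", **dynamodb_client_kwargs(env))
```

With sam local start-api, the variable can be supplied through the --env-vars option. Note that from inside the SAM container, localhost usually points at the container itself, so the endpoint typically has to be the host's address instead.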
The other half of the problem, which gets less attention, is whether the codebase itself is legible to the agent. A flat repo with inconsistent naming and business logic scattered across Lambda handlers is hard for a human to navigate. It’s worse for an agent, because at least a human will ask a question when confused. The agents we’ve worked with will just fill in the gaps with plausible-looking code.
Codebase Structure and the Stuff We’re Still Figuring Out
Honest take on hexagonal architecture: the full philosophical argument for ports and adapters is a bit oversold, and I’ve seen teams spend more time debating it than building. But the directory structure part, keeping /domain, /application, and /infrastructure as actual separate layers, genuinely helps. The agent stops trying to call DynamoDB from inside a business rule function because the structure makes it obvious that doesn’t belong there.
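To make the layering concrete, here is a small sketch of the three layers collapsed into one file. Everything in it is hypothetical: the Order fields, the approval rule, and the repository names are invented for illustration, not taken from our service. The point is only that the business rule never imports anything from AWS, and the use case depends on an interface that an in-memory fake can satisfy.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


# --- domain/: pure business rules, no AWS imports ---
@dataclass(frozen=True)
class Order:
    order_id: str
    total_cents: int
    status: str = "PENDING"


def approve(order: Order, limit_cents: int = 100_000) -> Order:
    """Hypothetical rule: orders above the limit need manual review."""
    status = "APPROVED" if order.total_cents <= limit_cents else "NEEDS_REVIEW"
    return Order(order.order_id, order.total_cents, status)


# --- application/: use cases that depend on a port, not on DynamoDB ---
class OrderRepository(Protocol):
    def get(self, order_id: str) -> Optional[Order]: ...
    def save(self, order: Order) -> None: ...


def process_order(repo: OrderRepository, order_id: str) -> Order:
    order = repo.get(order_id)
    if order is None:
        raise KeyError(f"unknown order: {order_id}")
    updated = approve(order)
    repo.save(updated)
    return updated


# --- infrastructure/: adapters; a DynamoDB-backed one would sit beside this ---
class InMemoryOrderRepository:
    def __init__(self) -> None:
        self._orders: Dict[str, Order] = {}

    def get(self, order_id: str) -> Optional[Order]:
        return self._orders.get(order_id)

    def save(self, order: Order) -> None:
        self._orders[order.order_id] = order
```

With this shape, an agent can exercise process_order against InMemoryOrderRepository in milliseconds, and only the adapter needs DynamoDB Local or a real table.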
Steering files in Kiro (kiro.dev) are a newer thing we’re still getting a feel for. You put architectural constraints in .kiro/steering/ and the agent checks against them before generating. It’s promising. Whether it holds up at scale we don’t fully know yet. Same with AGENT.md files: a simple markdown doc that explains the repo’s intent and conventions. Probably the cheapest thing you can do that actually helps.
For environments, ephemeral stacks via AWS CDK give you disposable, isolated infrastructure per branch so agents aren’t sharing state across experiments. We’ve found this cleaner than a long-lived staging environment, though the cold-start cost on CDK synth is real and occasionally annoying.
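One small piece of the per-branch setup that is easy to get wrong is naming. CloudFormation stack names allow only letters, digits, and hyphens, must start with a letter, and max out at 128 characters, while branch names follow none of those rules. A sketch of a naming helper follows; the function name and the "orders" prefix are assumptions for illustration, and the prefix is assumed to start with a letter.

```python
import hashlib
import re


def stack_name_for_branch(branch: str, prefix: str = "orders") -> str:
    """Derive a CloudFormation-safe stack name from a git branch name."""
    # Collapse every run of disallowed characters into a single hyphen.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", branch).strip("-").lower() or "branch"
    # A short hash keeps names distinct when two branches collapse to the same slug.
    digest = hashlib.sha1(branch.encode("utf-8")).hexdigest()[:8]
    max_slug = 128 - len(prefix) - len(digest) - 2  # two hyphen separators
    return f"{prefix}-{slug[:max_slug].rstrip('-')}-{digest}"
```

In a CDK app the result could be used as the stack's name, so each branch deploys to its own stack and the stack can be torn down with cdk destroy when the branch is merged.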
The CI/CD side of agentic AI is honestly still a bit unsettled for most teams, ours included. The progressive autonomy model (suggestions first, then AI-submitted PRs, then auto-merge on passing gates) makes sense in theory. In practice, deciding exactly where the human checkpoint belongs is more context-dependent than any framework suggests.
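For what it's worth, the three stages are easy to write down as a policy even if choosing the right stage is not. The sketch below is purely illustrative: the level names, the gate names, and the returned action strings are all made up, and it deliberately leaves out the hard part, which is deciding which level a given repo or change type should run at.

```python
from enum import Enum


class Autonomy(Enum):
    SUGGEST = 1      # agent proposes changes; a human applies them
    OPEN_PR = 2      # agent submits a PR; a human reviews and merges
    AUTO_MERGE = 3   # agent's PR merges without review when every gate passes


def next_action(level: Autonomy, gates: dict) -> str:
    """Decide what happens to an agent-produced change.

    `gates` maps a gate name (tests, lint, security scan, ...) to True/False.
    An empty gate set is treated as failing so nothing merges unchecked.
    """
    if level is Autonomy.SUGGEST:
        return "comment_with_suggestion"
    if level is Autonomy.OPEN_PR:
        return "open_pr_for_review"
    all_green = bool(gates) and all(gates.values())
    # A failed gate falls back to human review instead of blocking outright.
    return "merge" if all_green else "open_pr_for_review"
```

The fallback on a failed gate is one possible choice; a stricter team might close the PR or send it back to the agent instead.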
This is the kind of thing we dig into with teams at atxsoft.com. Happy to talk through it if your setup is anything like what I described.
References
- AWS Architecture Blog — Architecting for Agentic AI Development on AWS


