On October 30, 2025, I had the opportunity to speak at ODSC AI San Francisco, one of the world’s leading AI conferences. The talk focused on a topic that has become central to the evolution of modern AI systems: Agent Optimization. While “agents” have become the buzzword of the year, the industry is still wrestling with the same chronic challenges: unpredictable behavior, brittle context management, constant tinkering, and expensive fine-tuning cycles. My talk unpacked why these issues persist, and why a shift toward systematic optimization, beyond context engineering alone, is essential for building reliable Agentic AI.
This blog post summarises the key ideas from my session, references the full slides, and adds deeper analysis for readers interested in building future-proof agentic systems.
Short Summary
In my talk, I explored why agentic AI has captured so much attention yet remains so difficult for companies to adopt in practice. Businesses can clearly see the potential. Agents promise to be value multipliers and accelerators, capable of transforming workflows and unlocking new efficiencies. But when organisations run their first pilots, they quickly encounter the same problems. Prompts behave unpredictably, static context engineering breaks down, results are inconsistent and the systems cannot be trusted in production. As new models, tools and protocols emerge at a rapid pace, teams face the real risk of vendor lock-in and fragile architectures that crumble with every change. MIT’s recent prediction that 95 percent of agentic AI projects will fail highlights just how unprepared the current generation of agent systems is for the real world.
Under the hood, today’s agents are often thin software layers over brittle prompts and oversized contexts. This leads to endless tinkering, non-deterministic behaviour and engineering pain. The core issue is that most teams treat prompts as fixed instructions rather than dynamic components that need continuous evaluation and improvement. True progress requires full-stack agent optimization, where every element of an agent’s context is tuned over time, including prompts, RAG pipelines, memory layers, tools, model choices and communication protocols. Emerging open-source work like DSPy and GEPA shows the industry is beginning to move from prompt guessing to systematic optimization.
This philosophy is at the heart of what we are building at Superagentic AI. With SuperOptiX AI, we designed a modular and future-proof framework that avoids lock-in and lets organisations define their agent specification once, then compile, evaluate and optimize across any model, framework or tool. Instead of offering off-the-shelf SaaS agents, we take a forward-deployed approach by running private agents directly inside the enterprise environment. This allows teams to explore, test, and refine their agents safely, with no data leaving their premises, and prove real business value before scaling up. Agentic AI requires a new engineering discipline and a new way of thinking. By embracing full-stack optimization, we can finally build agents that are reliable, adaptable and ready for the future.
Talk Video
Talk Slides on SlideShare
Core Concepts
Here are some of the core concepts explained in the talk.
Agent Optimization: Bringing Reliability to the Future of AI Systems
In this talk, I wanted to address a challenge that every practitioner in the room had experienced firsthand. Modern AI agents have become shockingly capable, yet they continue to behave in ways that make engineers hesitate before trusting them with real production responsibilities. This tension between impressive results and unpredictable outcomes formed the central theme of my talk. It is also the foundation of the work we are building at Superagentic AI and of our framework SuperOptiX.
The State of Agents Today
During the talk, I shared a series of slides that captured the current landscape of agent development. Despite tremendous model advances, teams still face agents with inconsistent reasoning paths, context chains that balloon to unmanageable sizes, reinforcement learning pipelines that are expensive and fragile, and fine-tuning processes that rarely solve the underlying problems. Many engineering teams describe their agent workflows as cycles of endless adjustments. These systems often appear to work one week and then fail the next, sometimes triggered by nothing more than a subtle provider update.
This instability is not a reflection of weak engineering or lack of effort. It is a direct consequence of a deeper issue. The industry has spent much of the last two years attempting to fix agent behavior by improving the inputs. This began with the era of prompt engineering and quickly expanded into context engineering. Teams introduced retrieval augmentation, built toolchains, added custom memory architectures, and experimented with multi-agent patterns. These advancements absolutely improved capability, yet the unpredictability remained. Context engineering made agents more informed, but not more consistent.
Why Context Engineering Is Not Enough
This idea formed the core argument of my article titled Agent Optimization: Why Context Engineering Is Not Enough. Context engineering influences what information enters an agent, but it does not directly influence how the agent internalizes its instructions, how it evolves across iterations, or how it performs in the presence of noise, new tasks, or shifting model behavior. Without a system that can measure, tune, and improve agent behavior, context engineering becomes a never-ending patchwork of fixes.
As agents become more complex, with dozens of tools, layered memory, and evolving sub agents, the limits of context engineering become obvious. Adding more context sometimes leads to richer reasoning, but it just as often produces confusion, unnecessary planning, or unexpected tool calls. Reducing context makes reasoning cleaner but increases the risk of missing essential details. No purely input level strategy can deliver the reliability production systems demand.
The Role of Optimization and the Shift to Programmable Agents
This is why the focus of the field is shifting toward optimization. In the talk, I highlighted the work emerging from DSPy, which reframes agent development as a programmable and optimizable discipline rather than a manual one. DSPy encourages developers to define explicit objectives, collect measurable traces, and evolve prompts and reasoning strategies based on feedback. This approach treats prompts as tunable parameters rather than handcrafted text. It supports reflective evolution techniques that frequently outperform classical reinforcement learning approaches, while also being significantly more stable and cost effective.
Optimization places structure around how agents learn to behave. It provides metrics that reflect task performance. It produces traces that reveal where reasoning paths fail. It enables evaluators that act like the tests and checks that traditional software engineering has relied on for decades. Most importantly, optimization introduces a repeatable cycle that improves agents not through intuition but through measurable progress.
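To make this cycle concrete, here is a minimal, self-contained sketch of an evaluation-driven optimization loop in plain Python. It is illustrative only: the helpers (`run_agent`, `score`, `evaluate`) and the candidate prompts are hypothetical stand-ins, not the actual DSPy or SuperOptiX API, and a real agent call would hit a model rather than the toy logic shown here. The point is the shape of the loop: a metric, an evaluation set that acts like a test suite, and prompts treated as tunable parameters selected by measured performance rather than intuition.

```python
# Illustrative sketch of an evaluation-driven optimization loop.
# All names here (run_agent, score, evaluate, the candidate prompts)
# are hypothetical stand-ins, not a real DSPy or SuperOptiX API.

def run_agent(prompt: str, task: dict) -> str:
    """Stand-in for invoking an LLM-backed agent with a given prompt.

    A real implementation would call a model; this toy version just
    reacts to the prompt text so the loop is runnable end to end.
    """
    return task["input"].upper() if "UPPERCASE" in prompt else task["input"]

def score(output: str, task: dict) -> float:
    """Metric: exact match against a reference answer (0.0 or 1.0)."""
    return 1.0 if output == task["expected"] else 0.0

def evaluate(prompt: str, tasks: list) -> float:
    """Average metric over an evaluation set, like a test suite for agents."""
    return sum(score(run_agent(prompt, t), t) for t in tasks) / len(tasks)

# A small evaluation set with expected outputs.
tasks = [
    {"input": "hello", "expected": "HELLO"},
    {"input": "world", "expected": "WORLD"},
]

# Treat the prompt as a tunable parameter: try candidates, keep the best.
candidates = ["Echo the input.", "UPPERCASE the input."]
best = max(candidates, key=lambda p: evaluate(p, tasks))
print(best, evaluate(best, tasks))
```

Frameworks like DSPy automate exactly this selection step at scale, generating and refining candidate prompts and reasoning strategies against a declared metric instead of leaving the search to hand-tuning.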
Building Future-Proof Agentic Systems
This shift toward optimization has an important implication for the future of agentic systems. In the talk, I emphasized that teams should avoid locking themselves into any single model provider. The pace of change in foundation models is accelerating, and the teams that thrive will be those who build systems that can adapt to new capabilities, new model families, and new research directions without costly rewrites. A modular, evaluation driven, model agnostic architecture is the only sustainable path forward.
This belief shapes the work we are doing at Superagentic AI. Our goal is not to produce yet another agent framework but to rethink how intelligent systems should be built from the ground up. At Superagentic AI, we describe our philosophy as evaluation-first, optimization-core, and orchestration-native design. This philosophy is reflected in our open-source framework SuperOptiX, which offers RSpec-style testing for agents, a DSL for defining expectations, structured memory and model management, and orchestration patterns suitable for multi-agent architectures. SuperOptiX offers teams a low-friction entry point while still enabling optimization-driven workflows as their systems mature.
A New Chapter for Agentic AI
The response at ODSC AI San Francisco made it clear that early excitement around agents has now matured into a desire for stability, predictability, and engineering discipline. We are moving into a chapter defined not by larger context windows or more elaborate workflows, but by a more rigorous approach to building and refining intelligent systems.
Agent optimization is the missing foundation that turns agents from interesting experiments into dependable infrastructure. It transforms behavior from variable to consistent. It makes agentic systems robust in the face of new models and new research. And it provides the path toward truly reliable AI.
The Future Must Be Modular
New models, new tools, new protocols, new research, it’s all coming fast. The worst mistake an AI team can make right now is locking themselves into a vendor or rigid architecture. We need modular, future-proof agentic systems that adapt as the ecosystem evolves.
That’s exactly the philosophy behind Superagentic AI and our open-source framework SuperOptiX: a low-friction, low-risk way for teams to adopt agentic AI today—while gaining a clear path to structured optimization tomorrow.
A Full-Stack Approach to Agentic AI
Our goal at Superagentic AI is simple: Help teams build agentic systems that are stable, adaptable, and designed for the real world—not just demos. SuperOptiX brings together evaluation-first design, prompt + context optimization, and multi-agent orchestration into a framework that avoids lock-in and embraces the rapid pace of innovation in AI research.
Key Takeaways from the ODSC Talk
1. Prompt engineering is not enough
2. Agent optimization is the missing foundation
3. Multi-agent systems need orchestration, not ad-hoc scripts
4. Model-agnostic design wins
5. Evaluation must be the foundation
6. Full Stack Agent Optimization unlocks repeatability
Final Thoughts
The talk at ODSC AI San Francisco highlighted the core challenges facing modern AI agents, showing how today’s systems often behave unpredictably due to exploding context pipelines, brittle reinforcement learning, and costly fine-tuning that fails to address deeper architectural issues. It traced the industry’s evolution from simple prompt engineering to increasingly complex context and agentic context engineering, yet emphasized that these methods still leave fundamental reliability problems unsolved. The presentation showed that the future of agentic AI depends on optimization rather than more context, pointing to approaches like DSPy and GEPA that treat prompts and agent strategies as programmable, tunable parameters supported by metric traces and systematic evaluation. It concluded that agent systems must be modular, future-proof, and free from vendor lock-in, and introduced Superagentic AI and SuperOptiX as platforms built around evaluation-first, optimization-core principles designed to deliver stable, production-ready, next-generation agentic architectures.
